Leveraging Multi-Disciplinary Expertise Throughout the Total Product Life Cycle
Having an in-depth understanding of how the ML-enabled medical device will be integrated into the clinical workflow can help ensure that such devices are safe and effective. Developers should rethink the traditional device development process to include inputs from internal stakeholders such as the chief information security officer, privacy and data strategy personnel, and medical personnel. Input from these stakeholders may be needed earlier in the design and development process than is typical for traditional devices.
Implementing Good Software Engineering, Data Quality Assurance, Data Management and Security Practices
These practices include methodical risk management and design processes that capture and communicate design, implementation and risk management decisions and their rationale, and that ensure data authenticity and integrity. Developers should also consider FDA’s Content of Premarket Submissions for Management of Cybersecurity in Medical Devices guidance and the interoperability of ML-enabled devices within systems or workflows from different manufacturers.
Ensuring Training Data Sets Are Independent of Test Sets
Developers should consider sources of dependence (e.g., patient, data acquisition and site factors) and ensure that training datasets and test datasets are appropriately independent of one another. This principle suggests that regulators will expect developers to explain how they separated the training and test sets to control for bias and confounding factors.
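One common way to address patient-level dependence is to split by patient rather than by individual record, so that no patient contributes data to both sets. The following is a minimal sketch of that idea, not a method prescribed by the Guiding Principles; the function name `split_by_group` and the record fields `patient_id` and `scan` are hypothetical, and the same approach could be applied to a site or acquisition-device identifier.

```python
import random
from collections import defaultdict


def split_by_group(records, group_key, test_fraction=0.2, seed=0):
    """Split records so that all records sharing a group value
    (e.g., the same patient) land on the same side of the split."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record)

    # Shuffle group identifiers (not individual records) reproducibly.
    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)

    # Hold out a fraction of the groups, at least one, for testing.
    n_test = max(1, round(len(group_ids) * test_fraction))
    test_ids = set(group_ids[:n_test])

    train = [r for g in group_ids if g not in test_ids for r in groups[g]]
    test = [r for g in group_ids if g in test_ids for r in groups[g]]
    return train, test


# Hypothetical example: 50 scans from 10 patients, 5 scans per patient.
records = [{"patient_id": f"P{i % 10}", "scan": i} for i in range(50)]
train, test = split_by_group(records, "patient_id", test_fraction=0.2, seed=1)

train_patients = {r["patient_id"] for r in train}
test_patients = {r["patient_id"] for r in test}
print("patients in both sets:", train_patients & test_patients)
```

A record-level random split of the same data would almost certainly place scans from the same patient in both sets, which is the kind of dependence the principle asks developers to account for and document.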
Ensuring Selected Reference Datasets Are Based Upon Best Available Methods
Developers should use the best available, accepted methods for developing a reference standard to ensure they collect clinically relevant and well-characterized data, and should ensure that they understand the limitations of the reference. Where available, developers should use accepted reference datasets in model development and testing. This may present a hurdle for ML-enabled devices that address disease states or therapeutic areas for which there is no single universally accepted reference standard.
Tailoring Model Design to the Available Data and Reflecting the Intended Use of the Device
Model design should be suited to the available data and should actively mitigate known risks (e.g., overfitting, performance degradation, security risks). The Guiding Principles suggest that regulators may expect developers to provide more detailed information demonstrating that the design of the model aligns with the product’s proposed intended use and indications for use, both in how it mitigates risks and in how it demonstrates efficacy and performance.
Placing Focus on the Performance of the Human-AI Team
To the extent the model has a human element, developers should consider human factors and the interpretability of model outputs. Considerations that inform traditional device development, such as the impact of human factors, the need for specialized training to use the device, the expected effect on clinical outcomes (i.e., improvements), and the impact on clinical and other user workflows, will be equally important for machine-learning tools.
Demonstrating Device Performance by Testing During Clinically Relevant Conditions
Device performance should be evaluated independently of the training data set. Performance testing should consider the intended patient population, clinical environment, human users, measurement inputs and potential confounding factors.
Providing Users With Clear, Essential Information
Users should be provided with clear, contextually relevant information, including the product’s intended use and indications for use, information about the model’s performance in relevant subgroups, characteristics of the data used to train and test the model, acceptable inputs, known limitations, how to interpret the user interface and how the model integrates into the clinical workflow. Users also should be apprised of device modifications, updates from real-world performance monitoring, the basis for decision-making, and a way to communicate product concerns to the developers.
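As an illustration of what reporting performance in relevant subgroups can involve for a binary classifier, the sketch below computes sensitivity and specificity separately for each subgroup. It is an assumption-laden example rather than anything the Guiding Principles specify: the function name, the `(subgroup, y_true, y_pred)` row layout and the subgroup labels are all hypothetical, and the appropriate metrics depend on the device.

```python
from collections import defaultdict


def subgroup_sensitivity_specificity(rows):
    """rows: iterable of (subgroup, y_true, y_pred) with binary labels 0/1.
    Returns per-subgroup sample size, sensitivity and specificity."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for subgroup, y_true, y_pred in rows:
        if y_true == 1:
            key = "tp" if y_pred == 1 else "fn"
        else:
            key = "tn" if y_pred == 0 else "fp"
        counts[subgroup][key] += 1

    report = {}
    for subgroup, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        report[subgroup] = {
            "n": positives + negatives,
            # None signals that the metric is undefined for this subgroup.
            "sensitivity": c["tp"] / positives if positives else None,
            "specificity": c["tn"] / negatives if negatives else None,
        }
    return report


# Hypothetical held-out results for two subgroups.
rows = [
    ("age<65", 1, 1), ("age<65", 1, 0), ("age<65", 0, 0),
    ("age>=65", 0, 1), ("age>=65", 0, 0),
]
report = subgroup_sensitivity_specificity(rows)
for name, metrics in report.items():
    print(name, metrics)
```

Reporting the sample size alongside each metric matters because a subgroup with very few cases, or with no positive cases at all, cannot support a meaningful performance claim.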
Monitoring Deployed Models for Performance and Ensuring Retraining Risks Are Managed
Developers should monitor deployed models. Additionally, when models are trained after deployment, whether continually or periodically, developers should ensure that there are appropriate controls to manage risks of overfitting, unintended bias or degradation of the model (e.g., dataset drift) that could impact the safety or performance of the deployed model. Developers also should consider how to ensure that the datasets they use to develop and train models will not become stale or outdated over time. The Guiding Principles suggest that regulators will expect developers to consider how changes to real-world clinical assumptions, diagnosis or treatment standards may impact the tool’s performance over its expected lifecycle.
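One simple way to watch for dataset drift in a single numeric input is to compare its distribution in deployment against the distribution seen during development. The sketch below uses the population stability index (PSI) for that comparison; this is one of several possible drift statistics and is offered only as an illustration. The function name, the equal-width binning, the `eps` floor for empty bins and the alert threshold are assumptions that a developer would need to justify for a particular device.

```python
import math


def population_stability_index(reference, live, n_bins=10, eps=1e-6):
    """PSI between a reference sample and a live sample of one numeric
    feature, using equal-width bins spanning the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant data

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            # Values outside the reference range fall into the edge bins.
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        # Floor each proportion at eps so the logarithm is always defined.
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    live_p = proportions(live)
    return sum((l - r) * math.log(l / r) for r, l in zip(ref_p, live_p))


# Hypothetical feature values: development data vs. shifted field data.
reference = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]

psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, shifted)

ALERT_THRESHOLD = 0.25  # assumed value for illustration only
print("drift alert:", psi_shifted > ALERT_THRESHOLD)
```

A statistic like this only flags that inputs have changed. Whether the change affects safety or performance still has to be assessed against labeled outcomes and the device's risk controls.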
Although the Guiding Principles provide practical, common-sense principles for GMLP, the concepts are not necessarily new. The more challenging task for the regulators and for industry will be developing concrete practices, policies and procedures for ML tools within or alongside the existing framework for medical device quality system regulation in the United States, UK, European Union and other regions.
The Guiding Principles docket, FDA-2019-N-1185, is open for public comment. FDA recently announced that it plans to publish a draft guidance on Marketing Submission Recommendations for A Change Control Plan for Artificial Intelligence/Machine Learning (AI/ML)-Enabled Device Software Functions, as development resources permit, in the current Fiscal Year 2022.