
Rano Karno – images

Columns: Use_Case_ID | Department_Code | Agency | Office | Title | Summary | Development_Stage | Techniques | Source_Code | Department
DHS-0000-2023 | DHS | Customs and Border Protection | AI Curated Synthetic Data
AI Curated Synthetic Data creates synthetic data for computer vision to enable more capable and ethical AI when detecting anomalies in complex environments.
Specifically, it creates an emulated X-ray sensor that can produce visually realistic synthetic X-ray scan images similar to real X-ray scan images, and virtual 3D assets of vehicles and narcotics containers. These images will be used to enhance the development of Anomaly Detection Algorithms for Non-Intrusive Inspection, incorporating AI/ML for the detection of narcotics and other contraband in conveyances and cargo.
Development Stage: Initiation | Techniques: Synthetic Image Generation | Department: Department of Homeland Security
DHS-0001-2023 | DHS | Customs and Border Protection | AI for Autonomous Situational Awareness
The AI for Autonomous Situational Awareness system is intended to use IoT sensor kits to covertly detect and track illicit cross-border traffic in remote locations.
The system will leverage a motion image/video system enhanced with Artificial Intelligence that is capable of vehicle detection and direction determination. It will also incorporate a motion sensor that, when triggered, wakes up a high-resolution camera to capture a series of pictures, with additional sensors providing confirmation prior to camera capture.
Images captured will be processed by Artificial Intelligence models to classify objects, determine vehicle direction at intersections, and provide imagery sufficient for re-identification. Ultimately, the system is intended to be a low-footprint, low-cost, low-power system that provides situational awareness and covert detection.
Development Stage: Development and Acquisition | Techniques: Machine Vision | Department: Department of Homeland Security
DHS-0002-2023 | DHS | Customs and Border Protection | Automated Item of Interest Detection - ICAD
The software analyzes photographs that are taken by field imaging equipment, which are then fed into the ICAD system for review by USBP agents and personnel. The Matroid software currently processes and annotates images using proprietary software to determine if any of the images contain human subjects.
Matroid is the name of the Video Computer Aided Detection system used by CBP. It uses trained computer vision models that recognize objects, people, and events in any image or video stream. Once a detector is trained, it can monitor streaming video in real time, or efficiently search through pre-recorded video data or images to identify objects, people, and events of interest.
The intent for the ICAD system is to expand the models used to vehicles, and subjects with long-arm rifles, while excluding items of little or no interest such as animals.
Development Stage: Operation and Maintenance | Techniques: Machine Learning | Department: Department of Homeland Security
DHS-0003-2023 | DHS | Customs and Border Protection | Autonomous Aerostat
Aerostat capability that uses three tethers instead of the traditional single tether, coupled with advanced weather sensors, analytic capabilities, and powerful winches. The AI/ML model is used to detect the need to launch and land based on weather. It also leverages AI and robotics to autonomously launch and recover the aerostat during inclement weather events without the need for on-site staffing, allowing the aerostat to operate autonomously and saving time and manpower.
Development Stage: Development and Acquisition | Techniques: Automation & Robotics | Department: Department of Homeland Security
DHS-0004-2023 | DHS | Customs and Border Protection | Autonomous Maritime Awareness
The Autonomous Maritime Awareness system combines surveillance towers, ocean data solutions, unmanned autonomous surface vehicles (ASV), and AI to autonomously detect, identify, and track items of interest in a maritime environment.
The towers are low-cost, customizable, and relocatable surveillance systems. They are equipped with a suite of radars and day/night camera sensors. The ASVs have been ruggedized for the open ocean and are powered by wind, solar, and/or onboard engine as required, allowing them to operate in an area of responsibility (AOR) for up to 12 months. Their sensor suite includes cameras and radar.
Both systems use AI/ML to detect and identify objects, determine items of interest (IoI), and autonomously track those items using their sensor suites. Once identified, these systems can send alerts to monitoring agencies for at-sea interdictions of potential targets and/or intel collections.
Development Stage: Development and Acquisition | Techniques: Machine Learning, Automation & Robotics | Department: Department of Homeland Security
DHS-0005-2023 | DHS | Customs and Border Protection | Autonomous Surveillance Towers (Anduril)
Autonomously detects, identifies, and tracks items of interest using Artificial Intelligence integrated with the tower. It does not require a dedicated operator, is rapidly deployable, and is relocatable in less than a day by 2-3 people.
The system features a hybrid command and control capability, hosted in the government cloud, and is accessible via URL by desktop, laptop, tablet, or smartphone. It is solar powered with battery backup and requires no accompanying physical infrastructure, while providing visibility of 1.5 miles (2.4 km) for people and 3 miles (4.8 km) for vehicles.
The Lattice system permits autonomous detection, identification, and tracking of Items of Interest (IoIs). The tower scans constantly and autonomously. The radar detects and recognizes movement. The camera slews autonomously to the IoI and the system software identifies the object. The system alerts the user and autonomously tracks the IoI. End users can monitor the system and see near-real-time photos by logging into the User Interface on any CBP device.
Development Stage: Operation and Maintenance | Techniques: Machine Vision | Department: Department of Homeland Security
DHS-0006-2023 | DHS | Customs and Border Protection | Data and Entity Resolution
Automates data unification and entity resolution with a high level of trust at enterprise scale and speed.
Data and Entity Resolution uses Machine Learning modeling to ingest multiple data sources and develop models that associate disparate records to identify probable connections, unique entities, and/or commonalities between multiple independently submitted records.
The automation of entity resolution within the models is supported by a tool that enables non-technical end users to continuously train models through a user-friendly interface.
Development Stage: Operation and Maintenance | Techniques: Natural Language Processing (NLP) | Department: Department of Homeland Security
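The record-association step described above can be sketched in miniature: compare candidate record pairs field by field and link those whose weighted similarity clears a threshold. This is an illustrative stand-in for the trained ML models; the field names, weights, and threshold are assumptions, and stdlib string similarity substitutes for learned matching.

```python
# Minimal record-linkage sketch: pair records whose fields are similar
# enough to suggest the same real-world entity. Field names, weights,
# and the 0.85 threshold are illustrative assumptions.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] after light normalization."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted average of per-field similarities."""
    weights = {"name": 0.5, "dob": 0.3, "address": 0.2}
    return sum(w * similarity(rec_a[f], rec_b[f]) for f, w in weights.items())

def resolve(records: list[dict], threshold: float = 0.85) -> list[tuple[int, int]]:
    """Return index pairs of records that likely refer to the same entity."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if match_score(records[i], records[j]) >= threshold:
                pairs.append((i, j))
    return pairs

recs = [
    {"name": "John A. Smith", "dob": "1980-02-14", "address": "12 Elm St"},
    {"name": "John A Smith", "dob": "1980-02-14", "address": "12 Elm Street"},
    {"name": "Jane Doe", "dob": "1975-07-01", "address": "99 Oak Ave"},
]
print(resolve(recs))  # → [(0, 1)]
```

A production system would add blocking (so it avoids the O(n²) pairwise loop) and learn the weights from labeled matches rather than fixing them by hand.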
DHS-0007-2023 | DHS | Customs and Border Protection | Entity Resolution
The third-party global trade data is used to augment and enrich the agency's investigations into entities of interest. It combines data from companies and goods across multiple languages, then provides network analysis to assess trade flows and risks associated with cross-border trade.
This can validate agency-held information or provide a better understanding of networks of interest to the agency to better inform investigations that cross borders. AI/ML models help manage the information provided through the software, including behind-the-curtain collection of information, structuring of data, entity resolution, network analysis, risk analysis, and other functions that contribute to the software knowledge graph and frontend that end users interact with.
Development Stage: Development and Acquisition | Techniques: Natural Language Processing (NLP) | Department: Department of Homeland Security
DHS-0008-2023 | DHS | Customs and Border Protection | Geospatial imagery utilizing annotation
Leverages a commercial constellation of Synthetic Aperture Radar (SAR) satellites with readily available data, capable of imaging any location on Earth, day and night, regardless of cloud cover.
Utilizes AI, including machine vision, object detection, object recognition, and annotation, to detect airframes, military vehicles, and marine vessels, as well as built-in change detection capabilities for disaster response missions.
Development Stage: Development and Acquisition | Techniques: Machine Vision | Department: Department of Homeland Security
DHS-0009-2023 | DHS | Customs and Border Protection | Integrated Digital Environment
The Integrated Digital Environment provides managers with a better understanding of end user workflows, the most and least used applications, and opportunities for improvement.
The AI/ML model is applied to end user activity data (e.g., use of applications, flow between applications) to help CBP identify opportunities for more efficient or effective configuration of interfaces, use of resources, or development and deployment of CBP's applications. It tailors analytics and insight generation to allow metrics gathering, usage recording/observation, dashboarding, and workflow experimentation/suggestions to support analysts utilizing the entire suite of agency and open-source data systems. It also customizes existing capabilities to allow the exact automations needed for agency applications and systems, creating an integrated digital environment with greater connectivity and security between applications and a better ability for CBP administrators to manage and optimize use of applications by end users.
Development Stage: Development and Acquisition | Techniques: Natural Language Processing (NLP) | Department: Department of Homeland Security
DHS-0010-2023 | DHS | Customs and Border Protection | RVSS Legacy Overhauled System Project (INVNT)
Video Computer Aided Detection (VCAD) (also known as Matroid AI) is software that enables CBP end users to create and share vision detectors.
VCAD detectors are trained computer vision models that recognize objects, people, and events in any image or video stream. Once a detector is trained, it can monitor streaming video in real time, or efficiently search through pre-recorded video data or images to identify objects, people, and events of interest.
Users can view detection information via a variety of reports and alert notifications to process and identify important events and trends. Detection data is also available through VCAD's powerful developer Application Programming Interface (API) and language-specific clients, so CBP applications can be integrated with the power of computer vision.
The Matroid software currently processes and annotates images using proprietary software to determine if any of the images contain human subjects. Future use cases include the potential to detect additional items of interest, such as vehicles and subjects with long-arm rifles or large backpacks, and to exclude items of little or no interest such as animals.
Development Stage: Deployment | Department: Department of Homeland Security
DHS-0011-2023 | DHS | Customs and Border Protection | Use of technology to identify proof of life
The Use of technology to identify proof of life, or "Liveness Detection," uses Artificial Intelligence to reduce fraudulent activity, primarily for use within the CBP One app.
The CBP One app is designed to provide the public with a single portal to a variety of CBP services. It includes different functionality for travelers, importers, brokers, carriers, International Organizations, and other entities under a single consolidated log-in, and uses guided questions to help users determine the correct services, forms, or applications needed.
The Liveness Detection component used by the authentication system for the CBP One app uses the user's mobile device camera in addition to Artificial Intelligence algorithms to determine if the face presented to the app is the person in front of the camera at the time of capture and not a photo, mask, or other spoofing mechanism. Being able to accept submitted data with confidence that the submitting individual is who and where they claim to be is critical to the functionality of the app within the agency environment.
Development Stage: Development and Acquisition | Techniques: Machine Vision | Department: Department of Homeland Security
DHS-0012-2023 | DHS | Customs and Border Protection | Vessel Detection
Integrated technologies and analytics enhance maritime detection and the sensor network. Machine-assisted and AI-enhanced detection and tracking allow for improved illicit vessel detection in areas with high volumes of legitimate trade and recreational water vessel traffic by increasing situational awareness and responsiveness to threats.
Vessel Detection allows an agent to set a search area with criteria (e.g., people, drones, vehicles) and transmit those criteria to the sensors. Images detected by the sensors are auto-recognized using Artificial Intelligence. The AI algorithms filter, detect, and recognize objects and divide them into Items of Interest (IoI) and "other" objects.
Detections of IoI are shared with other detection systems, while detections of other objects (e.g., animals) are not shared. IoIs can be tracked and maintained across multiple sensors seamlessly.
Development Stage: Development and Acquisition | Techniques: Machine Vision | Department: Department of Homeland Security
DHS-0013-2023 | DHS | Cybersecurity and Infrastructure Security Agency | Advanced Analytic Enabled Forensic Investigation
CISA deploys forensic specialists to analyze cyber events at Federal Civilian Executive Branch (FCEB) departments and agencies, as well as other State, Local, Tribal, Territorial, and Critical Infrastructure partners. Forensic analysts can utilize advanced analytic tooling, in the form of Artificial Intelligence implementations, to better understand anomalies and potential threats. This tooling gives forensic specialists the capability to comb through data in an automated fashion with mathematically and probabilistically based models to ensure high-fidelity anomalies are detected in a timely manner.
Development Stage: Initiation | Techniques: Machine Learning | Department: Department of Homeland Security
DHS-0014-2023 | DHS | Cybersecurity and Infrastructure Security Agency | Advanced Network Anomaly Alerting
Threat hunting and Security Operations Center (SOC) analysts are provided terabytes per day of data from the National Cybersecurity Protection System's (NCPS) Einstein sensors. Manually developed detection alerts and automatic correlation via off-the-shelf tooling are common, but not comprehensive. Many network attacks can be probabilistically determined given sufficient training data and time. Analysts use automated tooling to further refine the alerts they receive and produce additional automated alerts based on aggregated information and backed by subject matter expertise. This tooling gives CISA analysts the capability to comb through data in an automated fashion with mathematically and probabilistically based models to ensure high-fidelity anomalies are detected in a timely manner.
Development Stage: Initiation | Techniques: Machine Learning | Department: Department of Homeland Security
DHS-0015-2023 | DHS | Cybersecurity and Infrastructure Security Agency | AI Security and Robustness
Frameworks, processes, and testing tools developed to govern the acquisition, development, deployment, and maintenance of AI technologies. Technology integrators within CISA, as well as the rest of the federal enterprise, use AI-enhanced tools to assure the trustworthy, robust, and secure operation of their AI systems. These tools use Machine Learning and Natural Language Processing to enhance the assessment of AI technology within the agency by speeding up data processing.
Development Stage: Initiation | Techniques: Machine Learning, Natural Language Processing (NLP) | Department: Department of Homeland Security
DHS-0016-2023 | DHS | Cybersecurity and Infrastructure Security Agency | AIS Scoring and Feedback
AIS Automated Scoring & Feedback (AS&F) uses descriptive analytics from organizational-centric intelligence to support confidence and opinion/reputation classification of indicators of compromise (IOCs). Looking at an indicator, AS&F determines whether the indicator is present in a known-good list by cross-referencing organizational-centric intelligence data of known non-malicious/benign indicators, and classifies it accordingly if true. If it is not a known good, AS&F determines whether there are sightings of the indicator by cross-referencing organizational-centric intelligence and classifies accordingly if true. If there are no sightings for the indicator, it determines whether the indicator has been verified by an analyst within the organizational-centric intelligence and classifies accordingly if true. Lastly, if the indicator has not been verified by an analyst, AS&F determines whether there are other reports within the organizational-centric intelligence about the indicator and classifies accordingly. AIS participants can triage against the populated opinion and/or confidence values to identify Indicator objects meeting or exceeding designated criteria and filter out the remaining data. AIS participants may also find value in utilizing the confidence score (if present) and the opinion value to understand whether any difference between the publisher and other organizations exists. Together, these enrichments can help those receiving information from AIS prioritize actioning and investigating Indicator objects.
Development Stage: Operation and Maintenance | Techniques: Descriptive Analysis, Machine Learning, NLP | Department: Department of Homeland Security
DHS-0017-2023 | DHS | Cybersecurity and Infrastructure Security Agency | Automated Indicator Sharing (AIS) Automated PII Detection
The Automated PII Detection and Human Review Process incorporates descriptive, predictive, and prescriptive analytics. Automated PII Detection leverages natural language processing (NLP) tasks, including named entity recognition (NER), coupled with Privacy guidance thresholds to automatically detect potential PII within AIS submissions. If a submission is flagged for possible PII, it is queued for human review, where analysts are provided with the submission and AI-assisted guidance on the specific PII concerns. Within Human Review, analysts are able to confirm/deny proper identification of PII and redact the information (if needed). Privacy experts are also able to review the actions of the system and analysts to ensure proper performance of the entire process, along with providing feedback to the system and analysts for process improvements (if needed). The system learns from the feedback of the analysts and Privacy experts. Through the incorporation of automated PII detection, CISA fully complies with the Privacy, Civil Rights, and Civil Liberties requirements of CISA 2015 and scales analyst review of submissions by removing false positives and providing guidance on submissions to be reviewed. Through continual audits, CISA will maintain integrity and trust in both system and human processes.
Development Stage: Operation and Maintenance | Techniques: Natural Language Processing (NLP) | Department: Department of Homeland Security
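The detect-then-queue flow above can be sketched with a toy detector: scan a submission for PII-shaped strings and, if any are found, flag it for human review. The real system uses NLP/NER models; here stdlib regexes stand in for the detectors, and the patterns and tuple shape are illustrative assumptions.

```python
# Simplified sketch of automated PII flagging for human review.
# Regexes stand in for the NER models described above; the pattern set
# and the (type, match) output shape are illustrative assumptions.
import re

PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def flag_submission(text: str) -> list[tuple[str, str]]:
    """Return (pii_type, matched_text) findings; non-empty => queue for review."""
    findings = []
    for pii_type, pattern in PII_PATTERNS.items():
        findings.extend((pii_type, m) for m in pattern.findall(text))
    return findings

submission = "Contact analyst at jdoe@example.gov about SSN 123-45-6789."
findings = flag_submission(submission)
print(findings)  # non-empty, so this submission would be queued for human review
```

Analyst confirm/deny decisions on such findings are exactly the feedback signal the entry says the system learns from.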
DHS-0018-2023 | DHS | Cybersecurity and Infrastructure Security Agency | Critical Infrastructure Anomaly Alerting
The Cyber Sentry program provides monitoring of critical infrastructure networks. Within the program, threat hunting analysts require advanced anomaly detection and machine learning capabilities to examine multimodal cyber-physical data on IT and OT networks, including ICS/SCADA. The Critical Infrastructure Anomaly Alerting model provides AI assistance in processing this information.
Development Stage: Initiation | Techniques: Machine Learning, Visualization | Department: Department of Homeland Security
DHS-0019-2023 | DHS | Cybersecurity and Infrastructure Security Agency | Cyber Incident Reporting
Cyber incident handling specialists utilize advanced automation tools to process data received through various threat intelligence and cyber incident channels. These tools leverage Machine Learning and Natural Language Processing to increase the accuracy and relevance of data that is filtered and presented to human analysts and decision-makers. Machine Learning techniques also assist in aggregating the information in reports for presentation and further analysis. This includes data received through covered CIRCIA entities.
Development Stage: Initiation | Techniques: Machine Learning, Natural Language Processing (NLP) | Department: Department of Homeland Security
DHS-0020-2023 | DHS | Cybersecurity and Infrastructure Security Agency | Cyber Threat Intelligence Feed Correlation
Cyber Threat Intelligence Feed Correlation uses AI-enabled capabilities to provide accelerated correlation across multiple incoming information feeds. This enables more timely enrichment to improve the externally shared information feeds. AI allows the algorithm to use the information items and results to learn the most efficient ways to perform the task. Additionally, tailored algorithms could be created to provide sustained surveillance of threat actor TTPs.
Development Stage: Initiation | Techniques: Machine Learning, Natural Language Processing (NLP) | Department: Department of Homeland Security
DHS-0021-2023 | DHS | Cybersecurity and Infrastructure Security Agency | Cyber Vulnerability Reporting
Vulnerability analysts require advanced automation tools to process data received through various vulnerability reporting channels, as well as aggregate the information for automated sharing. These tools leverage Machine Learning and Natural Language Processing to increase the accuracy and relevance of data that is filtered and presented to human analysts and decision-makers. Machine Learning techniques also assist in aggregating the information in reports for presentation and further analysis. This includes data in the KEV and CVE databases.
Development Stage: Initiation | Techniques: Natural Language Processing, Visualization | Department: Department of Homeland Security
DHS-0022-2023 | DHS | Cybersecurity and Infrastructure Security Agency | Malware Reverse Engineering
Reverse engineering of malware, and software analysis more broadly, will continue to be a critical activity in support of CISA's cyber defense mission. Threat Focused Reverse Engineering (TFRE) leverages advanced engineering, formal methods, and deep learning techniques for better cyber threat intelligence. Without scalable, automated tools, it is difficult to disrupt sophisticated adversaries' malware development lifecycle. New, unique, automated techniques are needed to better target adversaries, augment analysts, and create sophisticated tools for end users. Core tools disrupt the adversary's development lifecycle by exposing tactics, techniques, and procedures (TTPs). Analysts can spend more time and energy hunting and taking down threats; adversaries can spend less time operating malware and must commit more resources to reorient. TFRE consists of a broader development pipeline providing tool hardening, enhanced computational abilities, understanding of deployment environments, and other important capabilities.
Development Stage: Initiation | Techniques: Machine Learning | Department: Department of Homeland Security
DHS-0023-2023 | DHS | Cybersecurity and Infrastructure Security Agency | Operational Activities Explorer
Duty officers and analysts in CISA's Operations Center use a dashboard powered by artificial intelligence to enable sensemaking of ongoing operational activities. Artificial intelligence uses new near-real-time event data (from open source reporting, partner reporting, CISA regional staff, and cybersecurity sensors), coupled with historical cybersecurity and infrastructure security information and previous operational response activity, to recommend courses of action and engagement strategies with other government entities and critical infrastructure owners and operators based on potential impacts to the National Critical Functions.
Development Stage: Initiation | Techniques: Natural Language Processing (NLP), Machine Learning, Visualization | Department: Department of Homeland Security
DHS-0024-2023 | DHS | Cybersecurity and Infrastructure Security Agency | Security Information and Event Management (SIEM) Alerting Models
Threat hunting and Security Operations Center (SOC) analysts are provided terabytes per day of log data. Manually developed detection alerts and automatic correlation in a Security Information and Event Management tool are common, but not comprehensive. Many cyber attacks can be probabilistically determined given sufficient training data and time. Analysts use automated tooling to further refine the alerts they receive and produce additional automated alerts based on aggregated information and curated subject matter expertise. This tooling gives CISA analysts the capability to comb through data in an automated fashion with mathematically and probabilistically based models to ensure high-fidelity anomalies are detected in a timely manner.
Development Stage: Initiation | Techniques: Machine Learning | Department: Department of Homeland Security
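The "probabilistically based models" idea in the alerting entries above can be illustrated with the simplest possible baseline: flag a time bucket whose event count deviates sharply (by z-score) from a trailing window. This is a stdlib sketch under assumed data, not the production SIEM models, and the window and threshold are illustrative.

```python
# Sketch of probabilistic anomaly alerting over aggregated log volumes:
# a z-score against a trailing baseline stands in for the production ML
# models. Window size, threshold, and traffic data are illustrative.
from statistics import mean, stdev

def alert_anomalies(counts, window=10, z_threshold=3.0):
    """Flag indices whose event count deviates strongly from the trailing window."""
    alerts = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma > 0 and abs(counts[i] - mu) / sigma > z_threshold:
            alerts.append(i)
    return alerts

# Steady traffic with one spike at index 15.
traffic = [100, 102, 98, 101, 99, 103, 97, 100, 102, 98,
           101, 99, 100, 102, 98, 500]
print(alert_anomalies(traffic))  # → [15]
```

Real alerting pipelines layer learned models and correlation rules on top of this kind of statistical baseline; the point here is only the shape of the computation.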
DHS-0025-2023 | DHS | HQ | Text Analytics for Survey Responses (TASR)
Text Analytics for Survey Responses (TASR) is an application for performing Natural Language Processing (NLP) and text analytics on survey responses. It is currently being applied by DHS OCHCO to analyze and extract significant topics/themes from unstructured text responses to open-ended questions in the quarterly DHS Pulse Surveys. Results of extracted topics/themes are provided to DHS Leadership to better inform agency-wide efforts to meet employees' basic needs and improve job satisfaction.
Development Stage: Operation and Maintenance | Techniques: Natural Language Processing (NLP), Latent Dirichlet Allocation | Department: Department of Homeland Security
DHS-0026-2023 | DHS | HQ, Customs and Border Protection, Cybersecurity and Infrastructure Security Agency, Countering Weapons of Mass Destruction, Immigration and Customs Enforcement, Intelligence and Analysis, Science and Technology | RelativityOne
RelativityOne is a document review platform used to gain efficiencies in document review in litigation, FOIA, and other arenas where large-scale document review and production is necessary.
Development Stage: Operation and Maintenance | Techniques: Machine Learning, Continuous Active Learning, Clustering | Department: Department of Homeland Security
DHS-0027-2023 | DHS | Immigration and Customs Enforcement | Normalization Services
HSI uses Artificial Intelligence to verify, validate, correct, and normalize addresses, phone numbers, names, and ID numbers to streamline the process of correcting data entry errors, point out purposeful misidentification, connect information about a person across HSI datasets, and cut down the number of resource hours needed for investigations.
Examples of the normalization services provided include: normalizing less well-defined addresses into usable addresses for analysis (such as those using mile markers instead of a street number); inferring ID type based on a user-provided ID value (such as distinguishing an SSN from a DL number without additional context); categorizing name parts while taking into account additional factors (including generational suffixes and multi-part family names); and validating and normalizing phone numbers to the E.164 standard, including their identified country of origin.
These services are provided as part of the Repository for Analytics in a Virtualized Environment (RAVEn). RAVEn is a DHS HSI Innovation Lab project that facilitates large, complex analytical projects to support ICE's mission to enforce and investigate violations of U.S. criminal, civil, and administrative laws. RAVEn also enables tools used to analyze trends and isolate criminal patterns as HSI mission needs arise. For more information, please read DHS/ICE/PIA-055, the Privacy Impact Assessment for RAVEn.
Development Stage: Operation and Maintenance | Techniques: Machine Learning | Department: Department of Homeland Security
DHS-0028-2023 | DHS | Immigration and Customs Enforcement | Machine Translation (Previously Language Translator)
Systran provides machine translation for over 100 different language combinations. Currently the Innovation Lab has licenses for translating Chinese, Spanish, Arabic, Farsi, Russian, German, Ukrainian, and Filipino to English. Systran can translate plain text, Word documents, and PDFs. A web-based UI and API endpoint are available.
Development Stage: Operation and Maintenance | Techniques: Machine Learning, Natural Language Processing (NLP) | Department: Department of Homeland Security
DHS-0029-2023 | DHS | Immigration and Customs Enforcement | Email Analytics
The Email Analytics application enables a user to review and analyze email data acquired through legal process. AI is incorporated to accomplish spam message classification and named entity recognition (NER) for entity extraction of names, organizations, locations, etc. It also integrates machine translation capabilities using a commercial product.
Development Stage: Implementation | Techniques: Machine Learning, Natural Language Processing (NLP) | Department: Department of Homeland Security
DHS-0030-2023 | DHS | Immigration and Customs Enforcement | Mobile Device Analytics
Mobile Device Analytics (MDA) has been developed to meet the demand on investigators to view and analyze massive amounts of data resulting from court-ordered mobile device extractions. The overarching goal of MDA is to improve the efficacy of agents and analysts in identifying pertinent evidence, relationships, and criminal networks from data extracted from cellular phones. Machine Learning is being developed for object detection (such as firearms, drugs, money, etc.) in photos and videos contained in the data.
This is a DHS HSI Innovation Lab / RAVEn project. The Repository for Analytics in a Virtualized Environment (RAVEn) facilitates large, complex analytical projects to support ICE's mission to enforce and investigate violations of U.S. criminal, civil, and administrative laws. RAVEn also enables tools used to analyze trends and isolate criminal patterns as HSI mission needs arise. For more information, please read DHS/ICE/PIA-055, the Privacy Impact Assessment for RAVEn.
Development Stage: Development and Acquisition | Techniques: Machine Learning, Object Detection, Natural Language Processing (NLP) | Department: Department of Homeland Security
DHS-0031-2023 | DHS | Immigration and Customs Enforcement | Barcode Scanner
The Barcode Scanner has been developed to scan and populate detected information into corresponding text fields within the RAVEn GO Encounter Card. The barcode scanner currently supports MRZ and PDF417 barcode types, frequently found on travel documents (passports and passport cards) and US driver's licenses.
This is a DHS HSI Innovation Lab / RAVEn project. The Repository for Analytics in a Virtualized Environment (RAVEn) facilitates large, complex analytical projects to support ICE's mission to enforce and investigate violations of U.S. criminal, civil, and administrative laws. RAVEn also enables tools used to analyze trends and isolate criminal patterns as HSI mission needs arise. For more information, please read DHS/ICE/PIA-055, the Privacy Impact Assessment for RAVEn.
Development Stage: Operation and Maintenance | Techniques: Machine Learning, Machine Vision | Department: Department of Homeland Security
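MRZ fields of the kind this scanner parses carry check digits that a reader can verify before populating form fields. The algorithm is standardized in ICAO Doc 9303: map characters to values (digits as-is, A-Z as 10-35, the filler '<' as 0), multiply by the repeating weights 7, 3, 1, and take the sum modulo 10.

```python
# MRZ check-digit computation per ICAO Doc 9303, as a scanner might use
# to validate a parsed field before accepting it.
def mrz_check_digit(field: str) -> int:
    """Weighted sum mod 10; '<' counts as 0, A-Z as 10-35."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if ch.isdigit():
            value = int(ch)
        elif ch.isalpha():
            value = ord(ch.upper()) - ord("A") + 10
        else:                       # filler character '<'
            value = 0
        total += value * weights[i % 3]
    return total % 10

# ICAO 9303 worked example: document number "D23145890" has check digit 7.
print(mrz_check_digit("D23145890"))  # → 7
```

A mismatch between the computed and printed check digit is a cheap signal that the optical read was bad and the field should be rescanned rather than auto-populated.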
DHS-0032-2023 | DHS | Immigration and Customs Enforcement | Facial Recognition Service
The Facial Recognition Service is used during investigations conducted by HSI agents and analysts to identify known individuals, as well as to extract faces for further investigation of perpetrators of offenses including child exploitation, human rights atrocities, and war crimes.
This is a DHS HSI Innovation Lab / RAVEn project. The Repository for Analytics in a Virtualized Environment (RAVEn) facilitates large, complex analytical projects to support ICE's mission to enforce and investigate violations of U.S. criminal, civil, and administrative laws. RAVEn also enables tools used to analyze trends and isolate criminal patterns as HSI mission needs arise. For more information, please read DHS/ICE/PIA-055, the Privacy Impact Assessment for RAVEn.
Development Stage: Operation and Maintenance | Techniques: Machine Learning, Machine Vision | Department: Department of Homeland Security
DHS-0033-2023 | DHS | United States Citizenship and Immigration Services | I-485 Family Matching
I-485 Family Matching is designed to create models that match family members to underlying I-485 petitions. The underlying immigrant petition defines whether the I-485 is employment-based or family-based. It also has information about the visa classification and priority date which, when compared against the Department of State's monthly Visa Bulletin, helps predict visa usage. It is difficult to match an I-485 to its underlying immigrant petition, because the only available field on which to match is the A-number. This number is not always present on the immigrant petition, and name/date of birth matching is not as reliable. The goal of I-485 Family Matching is to leverage AI to more confidently create connections between petitioners and their families based on limited data.
Additionally, it will be able to help identify and group I-485s filed by family members, as well as gather up the many ancillary forms they may have pending (such as I-765, I-131). As with immigrant petition matching, it can be difficult to match up I-485s filed by family members; in these cases the only similar field is a common address. Efforts have been made in the past to identify family members by address, but this is effective only to a point. The AI model will help make working with this data more reliable, as well as group individual petitioners, their families, and other helpful associated data together for faster and more accurate processing.
Development Stage: Development and Acquisition | Techniques: Machine Learning, Clustering, Regression | Department: Department of Homeland Security
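The address-based grouping described above can be sketched as: normalize each address to a canonical form, then bucket applications that share one. This stdlib toy stands in for the ML model; the normalization rules, field names, and receipt numbers are illustrative assumptions.

```python
# Toy sketch of address-based family grouping: canonicalize addresses,
# then bucket cases sharing one. Field names and the small abbreviation
# table are illustrative; the real system uses learned matching.
import re
from collections import defaultdict

def normalize_address(addr: str) -> str:
    """Lowercase, expand a few common abbreviations, squeeze whitespace."""
    addr = addr.lower().strip()
    addr = re.sub(r"\bst\.?\b", "street", addr)
    addr = re.sub(r"\bapt\.?\b", "apartment", addr)
    return re.sub(r"\s+", " ", addr)

def group_by_address(cases: list[dict]) -> dict[str, list[str]]:
    """Map each normalized address to the receipt numbers filed from it."""
    groups = defaultdict(list)
    for case in cases:
        groups[normalize_address(case["address"])].append(case["receipt"])
    return dict(groups)

cases = [
    {"receipt": "MSC111", "address": "12 Elm St Apt 4"},
    {"receipt": "MSC112", "address": "12 elm street apartment 4"},
    {"receipt": "MSC113", "address": "99 Oak Ave"},
]
print(group_by_address(cases))
```

This also shows the limitation the entry notes: exact-bucket grouping is "effective only to a point" (unit numbers, typos, and shared buildings all confound it), which is the gap the ML model is meant to close.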
DHS-0034-2023DHSUnited States Citizenship and Immigration ServicesI-539 approval predictionThis project attempts to train and build a machine learning throughput analysis model to predict when an I-539 "Application to Extend or Change Nonimmigrant Status" case will be approved through eProcessing. This allows for potential improvements to the approval process via the eProcessing channel.Development and AcquisitionMachine Learning, ClusteringDepartment of Homeland Security
DHS-0035-2023DHSUnited States Citizenship and Immigration ServicesIdentity Match Option (IMO) Process with DBIS Data MartsThe Identity Match Option (IMO) is used to derive a single identity across multiple systems for each applicant or beneficiary who interacts with USCIS. The IMO aims to aid in person-centric research and analytics.
USCIS maintains a variety of systems to track specific interactions with individuals – benefits case management, appointment scheduling, background check validation, and customer service inquiries. Each system captures its own person-centric data attributes (e.g., SSN, A-number, name, DOB, address, etc.) related to individuals interacting with the agency. The identity derivation process uses standard entity matching algorithms included as part of the IMO product to leverage these individual instances of person-centric data attributes to derive identities. The system is able to account for a variety of data formats and potential data quality issues in the source data. The resulting identities are linked back to the original source records, allowing analysts to see an individual’s comprehensive immigration history with the agency, perform fraud detection, and identify data quality issues requiring resolution.Operation and MaintenanceCriteria based identificationDepartment of Homeland Security
DHS-0036-2023DHSUnited States Citizenship and Immigration ServicesPerson-Centric Identity Services A-Number Management ModelThe vision of Person-Centric Identity Services (PCIS) is to be the authoritative source of trusted biographical and biometric information that provides real-time, two-way visibility between services into an individual's comprehensive immigration history and status. The A-Number Management model ingests person-centric datasets from various source systems for model training and evaluation purposes. The dataset includes biographic information (name, date of birth, Alien #, Social Security #, passport #, etc.) as well as biometric information (fingerprint IDs, eye color, hair color, height, weight, etc.) for model training and matching purposes.
The A-Number Management model identifies which records from within our identity database best match search criteria. The model uses machine learning to ensure that search results presented to authorized external partners for external integrations and servicing have a high degree of confidence with the search criteria so that trust in the PCIS entity resolution remains high.
The A-Number Management model plays a critical role in the entity resolution and surfacing of a person and all their associated records. The machine learning models are more capable of resolving "fuzzy" matches and can deal with the reality of varying data quality.Operation and MaintenanceEnsemble Learning, Machine LearningDepartment of Homeland Security
DHS-0037-2023DHSUnited States Citizenship and Immigration ServicesPerson-Centric Identity Services Deduplication ModelThe vision of Person-Centric Identity Services (PCIS) is to be the authoritative source of trusted biographical and biometric information that provides real-time, two-way visibility between services into an individual's comprehensive immigration history and status. The de-duplication model ingests person-centric datasets from various source systems for model training and evaluation purposes. Our dataset includes biographic information (name, date of birth, Alien #, Social Security #, passport #, etc.) as well as biometric information (fingerprint IDs, eye color, hair color, height, weight, etc.) for model training and matching purposes.
Critical to the success of PCIS is the entity resolution/deduplication of individual records from various systems of records to create a complete picture of a person. Using machine learning, it is able to identify which case management records belong to the same unique individual with a high degree of confidence. This allows PCIS to pull together a full immigration history for an individual without time-consuming research across multiple disparate systems.
The Deduplication model plays a critical role in the entity resolution and surfacing of a person and all their associated records. The ML models are more resilient to fuzzy matches and deal with the reality of different data fill rates more reliably.Operation and MaintenanceMachine LearningDepartment of Homeland Security
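Entity resolution of the kind these PCIS entries describe is often framed as a weighted comparison of person-centric fields. As a rough illustration only (the record fields, weights, and sample values below are hypothetical, not PCIS's actual method), a minimal sketch in Python using stdlib string similarity:

```python
import difflib

def field_sim(a: str, b: str) -> float:
    """Similarity of two normalized field values, in [0, 1]; empty fields score 0."""
    if not a or not b:
        return 0.0
    return difflib.SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def match_score(rec1: dict, rec2: dict, weights: dict) -> float:
    """Weighted average of per-field similarities between two person records."""
    total = sum(weights.values())
    return sum(w * field_sim(rec1.get(f, ""), rec2.get(f, ""))
               for f, w in weights.items()) / total

# Hypothetical records from two source systems (invented data).
a = {"name": "Maria Lopez-Garcia", "dob": "1985-03-12", "a_number": "A123456789"}
b = {"name": "Maria Lopez Garcia", "dob": "1985-03-12", "a_number": ""}
weights = {"name": 0.4, "dob": 0.3, "a_number": 0.3}
print(round(match_score(a, b, weights), 3))
```

A production system would add blocking, field-specific comparators, and a learned decision threshold, but the weighted-similarity core is the same idea.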
DHS-0038-2023DHSUnited States Citizenship and Immigration ServicesPredicted to NaturalizeThe Predicted to Naturalize model predicts when Legal Permanent Residents will be eligible to naturalize and attempts to provide a current address. This model could potentially be used to send correspondence to USCIS customers about their resident status and notify others of potential USCIS benefits.ImplementationMachine Learning, Clustering, RegressionDepartment of Homeland Security
DHS-0039-2023DHSUnited States Citizenship and Immigration ServicesSentiment Analysis - SurveysThe Sentiment Analysis - Surveys system provides a statistical analysis of quantitative survey results and then uses Natural Language Processing (NLP) modeling software to assign "sentiments" to categories ranging from strongly positive to strongly negative. This allows survey administrators to glean valuable information from employee satisfaction surveys from both quantitative and qualitative data. This capability is currently available on demand.Operation and MaintenanceR SQL and DatabricksDepartment of Homeland Security
DHS-0040-2023DHSUnited States Citizenship and Immigration ServicesTopic Modeling on Request For Evidence data setsBuilds models that identify lists of topics and documents that are related to each topic. Topic modeling provides methods for automatically organizing, understanding, searching, and summarizing text data. It can help with discovering the hidden themes in a collection and classifying the documents into the discovered themes.Development and AcquisitionNatural Language Processing (NLP), Machine Learning, ClusteringDepartment of Homeland Security
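Topic modeling proper typically uses methods such as LDA or NMF; as a much simpler stdlib-only stand-in for surfacing document themes, TF-IDF keyword extraction illustrates the idea of finding terms that characterize each document (the sample documents below are invented):

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> list:
    return re.findall(r"[a-z]+", text.lower())

def top_terms(docs: list, k: int = 3) -> list:
    """For each document, return the k terms with the highest TF-IDF weight."""
    tokenized = [tokenize(d) for d in docs]
    df = Counter()                      # document frequency of each term
    for toks in tokenized:
        df.update(set(toks))
    n = len(docs)
    result = []
    for toks in tokenized:
        tf = Counter(toks)
        # smoothed IDF downweights terms that appear in most documents
        scores = {t: tf[t] * math.log((1 + n) / (1 + df[t])) for t in tf}
        result.append([t for t, _ in sorted(scores.items(), key=lambda x: -x[1])[:k]])
    return result

docs = [
    "evidence request missing birth certificate evidence",
    "request for employment letter and pay records",
    "birth certificate translation request",
]
print(top_terms(docs))
```

Terms shared by every document (like "request" here) score near zero, so the surviving keywords act as a crude per-document "theme".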
DOC-0000-2023DOCInternational Trade Administration (ITA)B2B MatchmakingThe system's algorithms and AI technology qualify data and make B2B matches with event participants according to their specific needs and available opportunities. The system's inputs are data related to event participants, and the outputs are suggested B2B matches between participants and a match strength scorecard.Department of Commerce
DOC-0001-2023DOCInternational Trade Administration (ITA)Chatbot PilotChatbot embedded into trade.gov to assist ITA clients with FAQs, locating information and content, and suggesting events and services. ITA clients would enter input into the chatbot in the form of questions or responses to prompts. The chatbot would scan ITA content libraries and input from ITA staff and return answers and suggestions based on client persona (exporter, foreign buyer, investor).Department of Commerce
DOC-0002-2023DOCInternational Trade Administration (ITA)Consolidated Screening ListThe Consolidated Screening List (CSL) is a list of parties for which the United States Government maintains restrictions on certain exports, reexports, or transfers of items. It consists of the consolidation of 13 export screening lists of the Departments of Commerce, State, and Treasury. The CSL search engine has “Fuzzy Name Search” capabilities, allowing a search without knowing the exact spelling of an entity’s name. In Fuzzy Name mode, the CSL returns a “score” for results that exactly or nearly match the searched name. This is particularly helpful when searching on CSL for names that have been translated into English from non-Latin alphabet languages.Department of Commerce
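The CSL's actual scoring method isn't documented in this entry; as an illustration of the general idea of fuzzy name scoring, here is a minimal sketch using Python's stdlib difflib (the entity names below are invented, not real CSL entries):

```python
import difflib

def fuzzy_search(query: str, names: list, cutoff: float = 0.6) -> list:
    """Score each list entry against the query; return matches above cutoff, best first."""
    q = query.lower()
    scored = [(n, difflib.SequenceMatcher(None, q, n.lower()).ratio()) for n in names]
    return sorted([s for s in scored if s[1] >= cutoff], key=lambda x: -x[1])

# Hypothetical screening-list entries (invented data).
names = ["Abdul Rahman Trading Co.", "Abd al-Rahman Trading Company", "Pacific Widgets Ltd."]
for name, score in fuzzy_search("Abdul Rahman Trading Company", names):
    print(f"{score:.2f}  {name}")
```

Real screening systems typically combine several comparators (phonetic encodings, token-set overlap, transliteration handling) rather than a single edit-distance ratio.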
DOC-0003-2023DOCInternational Trade Administration (ITA)AD/CVD Self InitiationThe AD/CVD program investigates allegations of dumping and/or countervailable subsidies. Investigations are initiated when a harmed US entity files a petition identifying the alleged offense and the specific harm inflicted. Self-Initiation will allow ITA to monitor trade patterns for this activity and preemptively initiate investigations by identifying harmed US entities, often before these entities are aware of the harm.Department of Commerce
DOC-0004-2023DOCInternational Trade Administration (ITA)Market Diversification ToolkitThe Market Diversification Tool identifies potential new export markets using current
trade patterns. A user enters what products they make and the markets they currently
export to. The Market Diversification Tool applies a ML algorithm to identify and compare
markets that should be considered. The tool brings together product-specific trade and
tariff data and economy-level macroeconomic and governance data to provide a picture
of which markets make sense for further market research. Users can limit the markets in
the results to only the ones they want to consider and modify how each of the eleven
indicators in the tool contributes to a country’s overall score. Users can export all the data
to a spreadsheet for further analysis.Department of Commerce
DOC-0005-2023DOCMinority Business Development Administration (MBDA)Azure ChatbotAzure Chatbot is being leveraged to automate and streamline responses to potential questions from MBDA users while they interact with the external-facing MBDA website. The solution leverages AI-based chatbot responses coupled with Machine Learning and Natural Language Processing capabilities.Department of Commerce
DOC-0006-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Fisheries Electronic Monitoring Image LibraryThe Fisheries Electronic Monitoring Library (FEML) will be the central repository for
electronic monitoring (EM) data related to marine life.Department of Commerce
DOC-0007-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Passive acoustic analysis using ML in Cook Inlet, AKPassive acoustic data is analyzed for detection of beluga whales and classification of the different signals emitted by this species. Detection and classification are done with an ensemble of 4 CNN models and weighted scoring developed in collaboration with Microsoft. Results are being used to inform seasonal distribution, habitat use, and impact from anthropogenic disturbance within Cook Inlet beluga critical habitat. The project aims to expand to other cetacean species as well as anthropogenic noise.Department of Commerce
DOC-0008-2023DOCNational Oceanic and Atmospheric Administration (NOAA)AI-based automation of acoustic detection of marine mammalsTimely processing of these data is critical for adapting mitigation measures as climate
change continues to impact Arctic marine mammals. Infrastructure for Noise and
Soundscape Tolerant Investigation of Nonspecific Call Types (INSTINCT) is command line
software which was developed in-house for model training, evaluation, and deployment
of machine learning models for the purpose of marine mammal detection in passive
acoustic data. It also includes annotation workflows for labeling and validation. INSTINCT
has been successfully deployed in several analyses, and further development of detectors
within INSTINCT is desired for future novel studies and automation. Continued integration
of AI methods into existing processes of the CAEP acoustics group requires a skilled
operator familiar with INSTINCT, machine learning, and the acoustic repertoire of Alaska region marine mammals.Department of Commerce
DOC-0009-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Developing automation to determine species and count using optical survey data in the Gulf of MexicoVIAME - This project focuses on optical surveys collected in the Gulf of Mexico: 1) develop an image library of landed catch, 2) develop automated image processing (ML/DL) to identify and enumerate species from underwater imagery, and 3) develop automated algorithms to process imagery in near real time and download information to a central database.Department of Commerce
DOC-0010-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Fast tracking the use of VIAME for automated identification of reef fishWe've been compiling image libraries for use in creating automated detection and
classification models for use in automating the annotation process for the SEAMAP Reef
Fish Video survey of the Gulf of Mexico. This work is being conducted in VIAME but we're
looking at several other paths forward in the project to identify best performing models.
Current status is that models are performing well enough that we will incorporate
automated analysis in video reads this spring as part of a supervised annotation-qa/qc
process.Department of Commerce
DOC-0011-2023DOCNational Oceanic and Atmospheric Administration (NOAA)A Hybrid Statistical-Dynamical System for the Seamless Prediction of Daily Extremes and Subseasonal to Seasonal Climate VariabilityDemonstrate the skill and suitability for operations of a statistical-dynamical prediction system that yields seamless probabilistic forecasts of daily extremes and subseasonal-to-seasonal temperature and precipitation. We recently demonstrated a Bayesian statistical method for post-processing seasonal forecasts of mean temperature and precipitation from the North American Multi-Model Ensemble (NMME). We now seek to test the utility of an updated hybrid statistical-dynamical prediction system that facilitates seamless subseasonal and seasonal forecasting. Importantly, this method allows for the representation of daily extremes consistent with climate conditions. This project explores the use of machine learning.Department of Commerce
DOC-0012-2023DOCNational Oceanic and Atmospheric Administration (NOAA)FathomNetFathomNet provides much-needed training data (e.g., annotated and localized imagery) for developing machine learning algorithms that will enable fast, sophisticated analysis of visual data. We've utilized interns and college class curricula to localize annotations on NOAA video data for inclusion in FathomNet and to begin training our own algorithms.Department of Commerce
DOC-0013-2023DOCNational Oceanic and Atmospheric Administration (NOAA)ANN to improve CFS T and P outlooksFan, Y., Krasnopolsky, V., van den Dool, H., Wu, C., and Gottschalck, J. (2021). Using Artificial Neural Networks to Improve CFS Week 3-4 Precipitation and Temperature Forecasts.Department of Commerce
DOC-0014-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Drought outlooks by using ML techniquesDrought outlooks by using ML techniques with NCEP models. Simple NN and deep learning techniques are used with GEFSv12 to predict Week 1-5 Prcp & T2m over CONUS.Department of Commerce
DOC-0015-2023DOCNational Oceanic and Atmospheric Administration (NOAA)EcoCast: A dynamic ocean management tool to reduce bycatch and support sustainable fisheriesOperational tool that uses boosted regression trees to model the distribution of swordfish
and bycatch species in the California Current.Department of Commerce
DOC-0016-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Coastal Change Analysis Program (C-CAP)Beginning in 2015, C-CAP embarked on an operational high resolution land cover development effort that utilized geographic object-based image analysis and ML algorithms such as Random Forest to classify coastal land cover from 1m multispectral imagery. More recently, C-CAP has been relying on a CNN approach for deriving the impervious surface component of its land cover products. The majority of the work is accomplished through external contracts. Prior to the high-res effort, C-CAP focused on developing Landsat-based moderate resolution multi-date land cover for the coastal U.S. In 2002, C-CAP adopted a methodology that employed Classification and Regression Trees for land cover data development.Department of Commerce
DOC-0017-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Deep learning algorithms to automate right whale photo idAI for right whale photo id began with a Kaggle competition and has since expanded to include several algorithms to match right whales from different viewpoints (aerial, lateral) and body parts (head, fluke, peduncle). The system is now live and operational on the Flukebook platform for both North Atlantic and southern right whales. We have a paper in review at Mammalian Biology.Department of Commerce
DOC-0018-2023DOCNational Oceanic and Atmospheric Administration (NOAA)NN RadiationDeveloping fast and accurate NN LW- and SW radiations for GFS and GEFS. NN LW- and SW radiations have been successfully developed for a previous version of GFS; see doi: 10.1175/2009MWR3149.1. The stability and robustness of the approach used was demonstrated; see: https://arxiv.org/ftp/arxiv/papers/2103/2103.07024.pdf. NN LW- and SW radiations will be developed for the current versions of GFS and GEFS.Department of Commerce
DOC-0019-2023DOCNational Oceanic and Atmospheric Administration (NOAA)NN training software for the new generation of NCEP modelsOptimize NCEP EMC Training and Validation System for efficient handling of high spatial
resolution model data produced by the new generation of NCEP's operational models.Department of Commerce
DOC-0020-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Coral Reef WatchFor more than 20 years, NOAA Coral Reef Watch (CRW) has been using remote sensing, modeled, and in situ data to operate a Decision Support System (DSS) to help resource managers (our target audience), researchers, decision makers, and other stakeholders around the world prepare for and respond to coral reef ecosystem stressors predominantly resulting from climate change and warming of the Earth's oceans. Offering the world's only global early-warning system of coral reef ecosystem physical environmental changes, CRW remotely monitors conditions that can cause coral bleaching, disease, and death; delivers information and early warnings in near real-time to our user community; and uses operational climate forecasts to provide outlooks of stressful environmental conditions at targeted reef locations worldwide. CRW products are primarily sea surface temperature (SST)-based but also incorporate light and ocean color, among other variables.Department of Commerce
DOC-0021-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Robotic microscopes and machine learning algorithms remotely and autonomously track lower trophic levels for improved ecosystem monitoring and assessmentPhytoplankton are the foundation of marine food webs supporting fisheries and coastal
communities. They respond rapidly to physical and chemical oceanography, and changes
in phytoplankton communities can impact the structure and functioning of food webs. We
use a robotic microscope called an Imaging Flow Cytobot (IFCB) to continuously collect
images of phytoplankton from seawater. Automated taxonomic identification of imaged
phytoplankton uses a supervised machine learning approach (random forest algorithm).
We deploy the IFCB on fixed (docks) and roving (aboard survey ships) platforms to
autonomously monitor phytoplankton communities in aquaculture areas in Puget Sound
and in the California Current System. We map the distribution and abundance of
phytoplankton functional groups and their relative food value to support fisheries and
aquaculture and describe their changes in relation to ocean and climate variability and
change.Department of Commerce
DOC-0022-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Edge AI survey payload developmentContinued support of a multispectral aerial imaging payload running detection model pipelines in real-time. This is a nine-camera (color, infrared, ultraviolet) payload controlled by dedicated on-board computers with GPUs. YOLO detection models run at a rate faster than image collection, allowing real-time processing of imagery as it comes off the cameras. Goals of the effort are to reduce overall data burden (by TBs) and reduce the data processing timeline, expediting analysis and population assessment for arctic mammals.Department of Commerce
DOC-0023-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Ice seal detection and species classification in multispectral aerial imageryRefine and improve detection and classification pipelines, with the goals of reducing false positive rates (to 90% accuracy) and significantly reducing or eliminating the labor-intensive, post-survey review process.Department of Commerce
DOC-0024-2023DOCNational Oceanic and Atmospheric Administration (NOAA)First Guess Excessive Rainfall OutlookMachine learning product that provides a first guess for the WPC Excessive Rainfall Outlook (ERO); it is trained on the ERO with atmospheric variables. It is for the Day 4-7 products.Department of Commerce
DOC-0025-2023DOCNational Oceanic and Atmospheric Administration (NOAA)First Guess Excessive Rainfall OutlookMachine learning product that provides a first guess for the WPC Excessive Rainfall Outlook (ERO); it is trained on the ERO with atmospheric variables. It is for the Day 1-3 products.Department of Commerce
DOC-0026-2023DOCNational Oceanic and Atmospheric Administration (NOAA)CoralNet: Ongoing operational use, improvement, and development, of machine vision point classificationCoralNet is our operational point annotation software for benthic photo quadrat annotation. Our development of our classifiers has allowed us to significantly reduce our human annotation, and we continue to co-develop (and co-fund) new developments in CoralNet.Department of Commerce
DOC-0027-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Automated detection of hazardous low clouds in support of safe and efficient transportationThis is a maintenance and sustainment project for the operational GOES-R fog/low stratus
(FLS) products. The FLS products are derived from the combination of GOES-R satellite
imagery and NWP data using machine learning. The FLS products, which are available in AWIPS, are routinely used by the NWS Aviation Weather Center and Weather Forecast Offices.Department of Commerce
DOC-0028-2023DOCNational Oceanic and Atmospheric Administration (NOAA)The Development of ProbSevere v3 - An improved nowcasting model in support of severe weather warning operationsProbSevere is a ML model that utilizes NWP, satellite, radar, and lightning data to nowcast
severe wind, severe hail, and tornadoes. ProbSevere, which was transitioned to NWS operations in October 2020, is a proven tool that enhances operational severe weather warnings. This project aims to develop the next version of ProbSevere, ProbSevere v3.
ProbSevere v3 utilizes additional data sets and improved machine learning techniques to
improve upon the operational version of ProbSevere. ProbSevere v3 was successfully
demonstrated in the 2021 Hazardous Weather Testbed and a JTTI proposal was recently
submitted to facilitate an operational update. The development is funded by GOES-R.Department of Commerce
DOC-0029-2023DOCNational Oceanic and Atmospheric Administration (NOAA)The VOLcanic Cloud Analysis Toolkit (VOLCAT): An application system for detecting, tracking, characterizing, and forecasting hazardous volcanic eventsVolcanic ash is a major aviation hazard. The VOLcanic Cloud Analysis Toolkit (VOLCAT)
consists of several AI-powered satellite applications, including eruption detection, alerting, and volcanic cloud tracking. These applications are routinely utilized by Volcanic Ash Advisory Centers to issue volcanic ash advisories. Under this project, the VOLCAT products will be further developed, and subsequently transitioned to the NESDIS Common Cloud Framework, to help ensure adherence to new International Civil Aviation Organization requirements.Department of Commerce
DOC-0030-2023DOCNational Oceanic and Atmospheric Administration (NOAA)SUVI Thematic MapsThe GOES-16 Solar Ultraviolet Imager (SUVI) is NOAA's operational solar extreme-
ultraviolet imager. The SUVI Level 2 Thematic Map files in these directories are produced
by NOAA's National Centers for Environmental Information in Boulder, Colorado. These
data have been processed from Level 2 High Dynamic Range (HDR) composite SUVI
images. The FITS file headers are populated with metadata to facilitate interpretation by
users of these observations. Please note that these files are considered to be
experimental and thus will be improved in future releases. Users requiring assistance with
these files can contact the NCEI SUVI team by emailing goesr.suvi@noaa.gov. The SUVI
Thematic Maps product is a Level 2 data product that (presently) uses a machine learning
classifier to generate a pixel-by-pixel map of important solar features digested from all six
SUVI spectral channels.Department of Commerce
DOC-0031-2023DOCNational Oceanic and Atmospheric Administration (NOAA)BANTER, a machine learning acoustic event classifierA supervised machine learning acoustic event classifier using hierarchical random forestsDepartment of Commerce
DOC-0032-2023DOCNational Oceanic and Atmospheric Administration (NOAA)ProbSR (probability of subfreezing roads)A machine-learned algorithm that provides a 0-100% probability that roads are subfreezingDepartment of Commerce
DOC-0033-2023DOCNational Oceanic and Atmospheric Administration (NOAA)VIAME: Video and Image Analysis for the Marine Environment Software ToolkitThe Video and Image Analysis for the Marine Environment Software Toolkit, commonly known as VIAME, is an open-source, modular software toolkit that allows users to employ high-level, deep-learning algorithms for automated annotation of imagery using a low code/no code graphical user interface. VIAME is available free of charge to all NOAA users. The NOAA Fisheries Office of Science and Technology supports an annual maintenance contract covering technical and customer support by the developer, routine software updates, bug fixes, and development efforts that support broad, cross-center application needs.Department of Commerce
DOC-0034-2023DOCNational Oceanic and Atmospheric Administration (NOAA)ENSO Outlooks using observed/analyzed fieldsLSTM model that uses ocean and atmospheric predictors throughout the tropical Pacific to forecast ONI values up to 1 year in advance. An extension of this was submitted to the cloud portfolio with the intent of adding a CNN layer that uses reforecast data to improve the ONI forecasts.Department of Commerce
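The entry describes an LSTM; as a deliberately simpler stand-in for the underlying task of forecasting an index from lagged predictors, here is an ordinary least-squares autoregression solved via the normal equations, stdlib only (the series below is invented toy data, not ONI values):

```python
def fit_ols(X, y):
    """Solve (X^T X) b = X^T y by Gaussian elimination; X includes a bias column."""
    n = len(X[0])
    A = [[sum(X[r][i] * X[r][j] for r in range(len(X))) for j in range(n)] for i in range(n)]
    b = [sum(X[r][i] * y[r] for r in range(len(X))) for i in range(n)]
    for col in range(n):  # forward elimination with partial pivoting
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            A[r] = [a - f * c for a, c in zip(A[r], A[col])]
            b[r] -= f * b[col]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        coef[r] = (b[r] - sum(A[r][j] * coef[j] for j in range(r + 1, n))) / A[r][r]
    return coef

def predict(coef, row):
    return sum(c * v for c, v in zip(coef, row))

# Toy index series: predict index(t) from [1, index(t-1), index(t-2)].
series = [0.1, 0.3, 0.5, 0.6, 0.5, 0.3, 0.0, -0.2, -0.3, -0.2]
X = [[1.0, series[t - 1], series[t - 2]] for t in range(2, len(series))]
y = [series[t] for t in range(2, len(series))]
coef = fit_ols(X, y)
print(predict(coef, [1.0, series[-1], series[-2]]))
```

An LSTM replaces the fixed linear map with a learned recurrent state, which is what lets it exploit much longer predictor histories than a short lag window.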
DOC-0035-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Using community-sourced underwater photography and image recognition software to study green sea turtle distribution and ecology in southern CaliforniaThe goal of this project is to study green turtles in and around La Jolla Cove in the San Diego region, a highly populated site with ecotourism, by engaging with local photographers to collect green turtle underwater images. The project uses publicly available facial recognition software (HotSpotter) to identify individual turtles, from which we determine population size, residency patterns, and foraging ecology.Department of Commerce
DOC-0036-2023DOCNational Oceanic and Atmospheric Administration (NOAA)An Interactive Machine Learning Signals in Passive Acoustic Recordings Toolkit for Classifying Species Identity of Cetacean EcholocationDevelop robust automated machine learning detection and classification tools for acoustic
species identification of toothed whale and dolphin echolocation clicks for up to 20
species found in the Gulf of Mexico. Tool development project funded from June 2018 to May 2021. The tool will be used for automated analyses of long-term recordings from Gulf-wide passive acoustic moored instruments deployed from 2010-2025 to look at environmental processes driving trends in marine mammal density and distribution.Department of Commerce
DOC-0037-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Steller sea lion automated count programNOAA Fisheries Alaska Fisheries Science Center's Marine Mammal Laboratory (MML) is
mandated to monitor the endangered western Steller sea lion population in Alaska. MML
conducts annual aerial surveys of known Steller sea lion sites across the southern
Alaska coastline to capture visual imagery. It requires two full-time, independent counters to process overlapping imagery manually (to avoid double counting sea lions in multiple frames), and to count and classify individuals by age and sex class. These counts are vital for population and ecosystem-based modeling to better understand the species and ecosystem, to inform sustainable fishery management decisions, and are eagerly anticipated by stakeholders like the NOAA Alaska Regional Office, industry, and environmental groups. MML worked with Kitware to develop detection and image registration pipelines with VIAME (updates to the DIVE program to support updated interface needs). MML is now working to assess the algorithms' efficacy and develop a workflow to augment the traditional counting method (to RL 9).Department of Commerce
DOC-0038-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Steller sea lion brand sightingDetection and identification of branded Steller sea lions from remote camera images in the western Aleutian Islands, AK. The goal is to help streamline photo processing to reduce the effort required to review images.Department of Commerce
DOC-0039-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Replacing unstructured WW3 in the Great Lakes with a Recurrent neural network and a boosted ensemble decision treeInvestigated replacing unstructured WW3 in the Great Lakes with (i) a Recurrent Neural Network (RNN, specifically an LSTM) developed by EMC and (ii) a boosted ensemble decision tree (XGBoost) developed by GLERL. These two AI models were trained on two decades of wave observations in Lake Erie and compared to the operational Great Lakes unstructured WW3.Department of Commerce
DOC-0040-2023DOCNational Oceanic and Atmospheric Administration (NOAA)Using k-means clustering to identify spatially and temporally consistent wave systemsPostprocessing that uses k-means clustering to identify spatially and temporally consistent wave systems from the output of NWPS v1.3. It has been successfully evaluated in the field by NWS marine forecasters nationwide and has been implemented into operations on February 3, 2021.Department of Commerce
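The k-means step named in this entry can be illustrated with a plain Lloyd's-algorithm implementation, stdlib only (the (height, period) observations below are invented, and NWPS's actual feature space and initialization will differ):

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm: assign points to the nearest centroid, then re-average."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # index of the centroid with the smallest squared Euclidean distance
            i = min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        # recompute each centroid as the mean of its cluster; keep old one if empty
        centroids = [
            tuple(sum(d) / len(c) for d in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Toy (height_m, period_s) wave observations forming two loose systems.
pts = [(0.5, 4.0), (0.6, 4.2), (0.4, 3.8), (2.1, 9.0), (2.3, 9.5), (2.0, 8.8)]
cents, groups = kmeans(pts, k=2)
print(sorted(len(g) for g in groups))
```

In a wave-systems context each cluster would correspond to one wave system (e.g., local wind sea vs. long-period swell), with the centroid summarizing its characteristic height and period.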
DOC-0041-2023DOCNational Oceanic and Atmospheric Administration (NOAA)PickyUsing a CNN to pick out objects of a particular size from side scan imagery. Presents users with a probability that allows for automation of contact picking in the field. Side scan imagery is a simple one-channel intensity image, which lends itself well to basic CNN techniques.Department of Commerce
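The "basic CNN techniques" mentioned here build on convolving small kernels over the one-channel intensity grid. A minimal sketch of that core operation (in the cross-correlation form most deep learning libraries actually implement), with an invented toy image rather than real side scan data:

```python
def conv2d(img, kernel):
    """Valid-mode 2D convolution of a one-channel intensity grid with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [
        [
            sum(img[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# A bright 2x2 "contact" on a dark background, and a 2x2 averaging kernel.
img = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]
kernel = [[0.25, 0.25], [0.25, 0.25]]
resp = conv2d(img, kernel)
# The strongest response lands where the kernel fully overlaps the contact.
print(max(max(row) for row in resp))
```

A real detector stacks many learned kernels with nonlinearities and pooling, then maps the resulting feature maps to a contact probability; this sketch shows only the single-filter building block.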
DOC-0042-2023DOCNational Telecommunications and Information Administration (NTIA)Data Science: ClutterNTIA’s Institute for Telecommunication Sciences (ITS) is investigating the use of AI to automatically identify and classify clutter-obstructed radio frequency propagation paths. Clutter is vegetation, buildings, and other structures that cause radio signal loss through dispersion, reflection, and diffraction. It does not include terrain effects. The classifier is a convolutional neural network (CNN) trained using lidar data coinciding with radio frequency propagation measurements made by ITS. This trained CNN can be fed new radio path lidar data, and a clutter classification label is predicted.Department of Commerce
DOC-0043-2023DOCNational Telecommunications and Information Administration (NTIA)WAWENETSThe algorithm produces estimates of telecommunications speech quality and speech
intelligibility. The input is a recording of speech from a telecommunications system in
digital file format. The output is a single number that indicates speech quality (typically
on a 1 to 5 scale) or speech intelligibility (typically on a 0 to 1 scale)."Department of Commerce
DOC-0044-2023DOCUnited States Patent and Trademark Office (USPTO)AI retrieval for patent searchAugmentation for a next-generation patent search tool to assist examiners in identifying relevant
documents and additional areas to search. The system takes input from published or
unpublished applications and provides recommendations on further prior art areas to
search, giving the user the ability to sort by similarity to concepts of their choosing."Department of Commerce
DOC-0045-2023DOCUnited States Patent and Trademark Office (USPTO)AI use for CPC classificationSystem that classifies incoming patent applications based on the Cooperative Patent
Classification scheme for operational assignment of work and symbol recommendation for
AI search. Back-office processing system that uses incoming patent applications as input
and outputs the resulting classification symbols."Department of Commerce
DOC-0046-2023DOCUnited States Patent and Trademark Office (USPTO)AI retrieval for TM design coding and Image searchClarivate COTS solution to assist examiner identification of similar trademark images, to
suggest the correct assignment of mark image design codes, and to determine the
potential acceptability of the identifications of goods and services. System is anticipated
to use both incoming trademark images and registered trademark images and output
design codes and/or other related images."Department of Commerce
DOC-0047-2023DOCUnited States Patent and Trademark Office (USPTO)Enriched CitationData dissemination system that identifies which references, or prior art, were cited in
specific patent application office actions, including: bibliographic information of the
reference, the claims that the prior art was cited against, and the relevant sections that
the examiner relied upon. The system extracts information from unstructured office actions
and provides the information through a structured, public-facing API."Department of Commerce
DOC-0048-2023DOCUnited States Patent and Trademark Office (USPTO)Inventor Search Assistant (iSAT)Service to help inventors "get started" identifying relevant documents, figures, and
classification codes used to conduct a novelty search. The system takes a user-entered short
description of an invention and provides a user-selectable set of recommended documents,
figures, and classification areas."Department of Commerce
DOE-0000-2023DOEBrookhaven National LaboratoryAutomated sorting of high repetition rate coherent diffraction data from XFELS"Coherent X-rays are routinely provided today by the latest Synchrotron
and X-ray Free-electron Laser Sources. When these diffract from a
crystal containing defects, interference leads to the formation of a
modulated diffraction pattern called "speckle". When the defects move
around, they can be quantified by a correlation analysis technique called
X-ray Photon Correlation Spectroscopy. But the speckles also change
when the beam moves on the sample. By scanning the beam in a
controlled way, the overlap between the adjacent regions gives
redundancy to the data, which allows a solution of the inherent phase
problem. This is the basis of the coherent X-ray ptychography method,
which can achieve image resolutions of 10 nm, but only if the probe
positions are known.
The goal of this proposal is to separate "genuine" fluctuations of a
material sample from the inherent beam fluctuations at the high data
rates of XFELs. Algorithms will be developed to calculate the
correlations between all the coherent diffraction patterns arriving in a
time series, then used to separate the two sources of fluctuation using
the criterion that the "natural" thermal fluctuations do not repeat, while
beam ones do. We separate the data stream into image and beam
"modes" automatically."Department of Energy
DOE-0001-2023DOEBrookhaven National LaboratoryMachine Learning for Autonomous Control of Scientific User FacilitiesBNL will work alongside SLAC to implement ML algorithms into NSLS-
II Operations to interpret accelerator data more intelligently. We intend
to train these algorithms with 5+ years of archived device data from
accelerator components, records of previous fault causes (to connect to
data symptoms), and stored beam current."Department of Energy
DOE-0002-2023DOEBrookhaven National LaboratorySMMMAI/ML is being used to evaluate measurements in real time during
simultaneous experiments on two beamlines and then drive subsequent
data collection on both beamlines to maximize the scientific value
generated per unit time."Department of Energy
DOE-0003-2023DOEFermi National AcceleratorAI DenoisingThis program aims to develop generative models for quickly simulating
showers of particles in calorimeters for LHC experiments."Artificial Intelligence, Big Data, Neural Networks, Hierarchical Generative ModelDepartment of Energy
DOE-0004-2023DOEFermi National AcceleratorExtreme data reduction for the edgeThis project develops AI algorithms and tools for near-sensor data
reduction in custom hardware."Artificial Intelligence, Big Data, Neural Networks, Novel Spectroscopic TechnologyDepartment of Energy
DOE-0005-2023DOEFermi National AcceleratorHigh-Velocity AI: Generative ModelsThis project has two parts: 1. generating adversarial examples and then
using domain adaptation and other techniques to improve the
robustness of AI classification algorithms against those attacks
(focusing on astrophysics/cosmology applications); 2. using AI
algorithms to improve the output of low-quality classical simulation
engines to deliver a high-quality result at high speed."Artificial Intelligence, Big Data, Neural Networks, Hierarchical Generative ModelDepartment of Energy
DOE-0006-2023DOEFermi National Acceleratorhls4mlThis project develops hardware-software AI codesign tools for FPGAs
and ASICs for algorithms running at the extreme edge."Artificial Intelligence, Big Data, Neural NetworksDepartment of Energy
DOE-0007-2023DOEFermi National AcceleratorIn-pixel AI for future tracking detectorsThis project explores novel AI-on-chip technology for intelligent
detectors embedded with sensing technology."Artificial Intelligence, Big Data, Neural NetworksDepartment of Energy
DOE-0008-2023DOEFermi National AcceleratorIn-storage computing for multi- messenger astronomy in neutrino experiments and cosmological surveysThis project aims to address the big-data challenges and stringent time
constraints facing multi-messenger astronomy (MMA) in neutrino
experiments and cosmological surveys. Instead of following the
traditional computing paradigm of moving data to the compute
elements, it does the opposite, embedding computation in the data so
that processing is performed in situ. This will be achieved through emerging
computational storage accelerators on which ML algorithms may be
deployed to execute MMA tasks quickly so alerts can be disseminated
promptly."Artificial Intelligence, Big Data, Neural NetworksDepartment of Energy
DOE-0009-2023DOEFermi National AcceleratorMachine Learning for Accelerator Operations Using Big Data Analytics / L-CAPEBig data analytics for anomaly prediction and classification, enabling
automatic mitigation, operational savings, and predictive maintenance of
the Fermilab LINAC."Artificial Intelligence, Big Data, Neural NetworksDepartment of Energy
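The inventory does not specify L-CAPE's algorithms; as a hedged, minimal stand-in for the anomaly-prediction idea, here is trailing-window z-score flagging on a hypothetical device reading:

```python
import numpy as np

def flag_anomalies(readings, window=20, z_thresh=4.0):
    """Flag points that deviate strongly from a trailing window's mean.

    A simple stand-in for the kind of device-data screening described;
    the real L-CAPE system's methods are not specified in the inventory.
    """
    readings = np.asarray(readings, dtype=float)
    flags = np.zeros(len(readings), dtype=bool)
    for t in range(window, len(readings)):
        hist = readings[t - window:t]
        mu, sigma = hist.mean(), hist.std()
        if sigma > 0 and abs(readings[t] - mu) > z_thresh * sigma:
            flags[t] = True
    return flags

rng = np.random.default_rng(1)
signal = rng.normal(50.0, 0.5, size=200)  # steady beam-current-like reading
signal[150] = 60.0                        # injected fault
flags = flag_anomalies(signal)            # the fault at index 150 gets flagged
```

A production system would replace the rolling statistics with a learned model and pair each flag with a predicted fault class, enabling the automatic mitigation the entry describes.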
DOE-0010-2023DOEFermi National AcceleratorMachine Learning for Linac Improved PerformanceIn Linacs at FNAL and J-PARC, the current emittance optimization
procedure is limited to manual adjustments of a few parameters; using
a larger number is not practically feasible for a human operator. Using
machine learning (ML) techniques allows lifting this restriction and
expanding this set. Our goal is to integrate ML into linac operation, and
in particular RF control, to achieve a more optimal longitudinal emittance
and lower overall losses."Artificial Intelligence, Big Data, Neural NetworksDepartment of Energy
DOE-0011-2023DOEFermi National AcceleratorNext-Generation Beam Cooling and Control with Optical Stochastic CoolingThis program leverages the physics and technology of optical stochastic
cooling (OSC) to explore new possibilities in beam control and sensing.
The planned architecture and performance of a new OSC system at
IOTA should enable turn-by-turn programmability of the high-gain OSC.
This capability can then be used in conjunction with other hardware
systems as the basis of an action space for reinforcement learning (RL)
methods. The program aims to establish a new state of the art in beam
cooling and a flexible set of tools for beam control and sensing at
colliders and other accelerator facilities."Artificial Intelligence, Big Data, Neural NetworksDepartment of Energy
DOE-0012-2023DOEFermi National AcceleratorREADS: Real-time Edge AI for Distributed SystemsThis project will develop and deploy low-latency controls and prediction
algorithms at the Fermilab accelerator complex."Artificial Intelligence, Big Data, Neural NetworksDepartment of Energy
DOE-0013-2023DOEFermi National AcceleratorSimulation-based inference for cosmologyThis project will develop and use simulation-based inference to estimate
cosmological parameters related to cosmic acceleration in the early and
late universe — via the cosmic microwave background and strong
gravitational lensing, respectively. This will produce an analysis pipeline
that can be deployed for next-generation cosmic surveys."Artificial Intelligence, Big Data, Neural NetworksDepartment of Energy
DOE-0014-2023DOEFermi National AcceleratorSONIC: AI acceleration as a serviceThis project focuses on integration of AI hardware for at-scale inference
acceleration for particle physics experiments."Artificial Intelligence, Big Data, Neural NetworksDepartment of Energy
DOE-0015-2023DOEFermi National AcceleratorStreaming intelligent detectors for sPHENIX/EICThis project develops real-time algorithms for event filtering with tracking
detectors for nuclear physics collider experiments."Artificial Intelligence, Big Data, Neural NetworksDepartment of Energy
DOE-0016-2023DOEFermi National AcceleratorUncertainty Quantification and Instrument Automation to enable next generation cosmological discoveriesThis project will develop AI-based tools to enable critical sectors for near-
future cosmic applications. Uncertainty quantification is essential for
performing discovery science now, and simulation-based inference
offers a new approach. The automated design and control of
instrumentation will be important for improving the efficiency of planning
and executing cosmic experiments."Artificial Intelligence, Big Data, Neural NetworksDepartment of Energy
DOE-0017-2023DOEIdaho National LaboratoryDeep Learning Malware Analysis for reusable cyber defenses.The INL uses machine learning (feed forward neural network) on a large
data set of translated malware binaries in graph structures to identify
commonalities between malware samples."Department of Energy
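The entry describes finding commonality between malware represented as graphs. Independent of the neural network INL actually uses, a simple illustration of graph commonality is the Jaccard overlap of edge sets; the call-graph edges below are invented:

```python
# Illustrative sketch: Jaccard similarity of two (hypothetical) malware call graphs,
# each represented as a set of directed edges between function-level behaviors.

def jaccard(edges_a, edges_b):
    """Fraction of distinct edges shared by the two graphs."""
    a, b = set(edges_a), set(edges_b)
    return len(a & b) / len(a | b)

g1 = [("entry", "decrypt"), ("decrypt", "inject"), ("inject", "beacon")]
g2 = [("entry", "decrypt"), ("decrypt", "inject"), ("inject", "persist")]

sim = jaccard(g1, g2)
# sim == 0.5: two of the four distinct edges are shared
```

High overlap between a new sample and a known family suggests the same defenses can be reused, which is the point of the entry.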
DOE-0018-2023DOEIdaho National LaboratoryGeo Threat Observable for structured cyber threats related to the energy sectorCollection of open-source threat information related to cyber issues in
the energy sector, collected and stored in a graph database and used in
machine learning to find similarities between threats, enabling better
reuse of cyber protections."Department of Energy
DOE-0019-2023DOELawrence Livermore National LaboratoryAdvanced energy, batteries, and industrial efficiencyLeveraging data science to navigate design space for better batteries
and energy storage, as well as scale-up of various technologies."Department of Energy
DOE-0020-2023DOELawrence Livermore National LaboratoryAdvanced materials science, engineering, and exploration relevant to the other key technology focus areasApplying machine learning to develop specialized materials with
superior performance for scientific research and manufacturing systems."Department of Energy
DOE-0021-2023DOELawrence Livermore National LaboratoryAI/ML and other software advancesModel architecture development research, including workflows,
algorithm, and performance optimization."Department of Energy
DOE-0022-2023DOELawrence Livermore National LaboratoryBiology, genomics, and synthetic biologyCombining experimental and computational methods to perform
fundamental and applied research in genomics, molecular toxicology,
nanotechnology, host–pathogen biology, structural biology, genetics,
microbial systems, and medical countermeasures."Department of Energy
DOE-0023-2023DOELawrence Livermore National LaboratoryCyber security, data storage, and data management technologiesData-processing pipelines and user interfaces to process and
aggregate large, bulk, and possibly unstructured datasets, allowing for
search and export of data for further analysis in a secure way."Department of Energy
DOE-0024-2023DOELawrence Livermore National LaboratoryHigh-performance computing, semiconductors, and advanced computer hardwareNovel computer hardware architecture/configurations that can perform
at the edge and/or in harsh environments"Department of Energy
DOE-0025-2023DOELawrence Livermore National LaboratoryInnovation methods, processes and promising practices that can affect the speed and effectiveness of innovation processes at scale.Computational approaches that lead to faster insights into the
development and deployment of large-scale operations."Department of Energy
DOE-0026-2023DOELawrence Livermore National LaboratoryNatural and anthropogenic disaster prevention and mitigationLeveraging a broad, multimodal data stream to predict and understand
natural disaster scenarios for the purposes of prevention and mitigation."Department of Energy
DOE-0027-2023DOELawrence Livermore National LaboratoryQuantum computing and information systemsMachine learning and quantum computing applied towards optimization,
quantum chemistry, materials science, and cryptography."Department of Energy
DOE-0028-2023DOELawrence Livermore National LaboratoryRobotics, automation, and advanced manufacturingAI is being used for accelerating hardware development and
interpretation of sensor data to improve process reliability."Department of Energy
DOE-0029-2023DOENational Energy Technology LaboratoryAdvanced Image SegmentationU-Net CNN segmentation to isolate pore and fluid from computed
tomography scans of multiphase transport in cores."Neural Networks, OtherDepartment of Energy
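The U-Net itself is beyond a short sketch, but the task it solves (a per-pixel pore/grain mask) and the usual way such masks are scored can be illustrated with a toy CT slice and intersection-over-union; all values here are hypothetical:

```python
import numpy as np

def iou(pred, truth):
    """Intersection-over-union, the standard score for a segmentation mask."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return inter / union if union else 1.0

# Toy CT slice: pore space is the low-intensity region inside high-intensity grain
slice_ = np.full((6, 6), 200.0)   # grain
slice_[1:4, 1:4] = 30.0           # pore
truth = slice_ < 100              # ground-truth pore mask

# A naive global threshold stands in for the U-Net's per-pixel prediction
pred = slice_ < 120
score = iou(pred, truth)          # 1.0 for this toy threshold
```

Real CT slices have noise, partial-volume effects, and multiple fluid phases, which is why a learned per-pixel model like a U-Net outperforms a single global threshold.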
DOE-0030-2023DOENational Energy Technology LaboratoryAdvanced model to forecast offshore landslide risks and marine geohazardsThis research will use data and models from the Offshore Risk Modeling
(ORM) with intelligent databases, artificial intelligence (AI)/ML, big data,
and other advanced computing technologies to address offshore
subsurface natural-engineered system challenges, such as
characterization and mapping of geologic hazards, safe operations,
equipment reliability, and environmental assessments."Big Data, Natural Language Processing, OtherDepartment of Energy
DOE-0031-2023DOENational Energy Technology LaboratoryAI used to interpret sensor data.AI is being used to classify sensor data. An AI algorithm was written
and trained with a wide range of known sensor conditions to enable
automatic classification of sensor data into likely constituent gas
concentrations."OtherDepartment of Energy
DOE-0032-2023DOENational Energy Technology LaboratoryAI/ML may be needed to extract data from text, image and tabular- based resources. NEWTS is partnering with university teams to use ML to fill in data gaps using predictive models.NEWTS data requirements and database structure needs will be
established by reviewing datasets and literature on energy-water
streams. Data sources will be identified from regulatory agencies and
government monitoring programs, as well as open-source literature.
Metadata of each source will be compiled into a data catalog for
tracking and reference. Datasets, including high-quality composition
data for relevant streams, will be collected and downloaded. Acquired
data will be processed into a structured format based on the
prioritization of datasets to be included in NEWTS. Data acquisition and
processing might entail the application of ML (e.g., natural language
processing) to efficiently resurrect data trapped in historical reports
(e.g., PDFs) or other unstructured formats. One research product of this
subtask will be a release of the data catalog, which will be made
available on"Natural Language Processing, OtherDepartment of Energy
DOE-0033-2023DOENational Energy Technology LaboratoryAI/ML methodology for rapid design of sorbents tuned to specific ash impoundment and/or landfill requirements.Computation of the descriptors (atomic property-weighted radial
distribution functions) that will be used for the ML portion of the task;
Fitting of a machine-learned model for the prediction of boron (B) sorption;
Optimization and computational design of a sorbent for maximum
sorption of B as a function of B concentration in the aqueous solution;
Force field generation for an additional pollutant (if needed); Sorption
calculations and ML fitting for the second pollutant (TBD); Optimization
and computational design of a sorbent for maximum sorption of the
second pollutant as a function of pollutant concentration in the aqueous
solution."OtherDepartment of Energy
DOE-0034-2023DOENational Energy Technology LaboratoryAnalysis to Assess Offshore CCS Trends and GapsProviding expertise, input, and support for the development of a DOE
(NETL/FECM) carbon storage technical resources catalog that
facilitates searching for information about datasets, models and tools,
publications and reports, and competencies resulting from DOE-
FECM/NETL’s offshore and CSP activities. This project will complete a
review and analysis of knowledge and data resources resulting from
international offshore CCS projects. Outcomes of this analysis are
expected to include the integration of key data and tools into the EDX-
hosted Open Carbon Storage Database and DisCO2ver platform (in
development via the EDX4CCS FWP), as well as geo-data-science-
based analysis and recommendations on geologic and metocean
insights from international studies and their alignment or relevance to
U.S. Federal offshore settings."OtherDepartment of Energy
DOE-0035-2023DOENational Energy Technology LaboratoryANN Submodels of Reaction PhysicsANN development of flow physics for code accelerationOtherDepartment of Energy
DOE-0036-2023DOENational Energy Technology LaboratoryComputational capabilities to support experimental effortsThis subtask will leverage NETL’s in-house computational capabilities
and existing university collaborators to support experimental efforts by
providing atomic-level DFT and microkinetic modeling calculations for
catalyst systems. This work provides atomic-level details on reaction
energetics and establishes key structure-property relationships used to
optimize catalyst structure and formulation."Department of Energy
DOE-0037-2023DOENational Energy Technology LaboratoryComputational methods for the characterization of CO2 chemisorption in amine- functionalized MOFs.Databases of MOFs will be screened using computational methods to
identify promising MOFs. Software will be further developed to allow for
the addition of desirable functional groups (amines) to metal centers
and/or ligands of MOFs. The team will calculate the reaction enthalpy for
CO2 sorption in amine functionalized MOFs and further computational
methods for the characterization of CO2 chemisorption in amine-
functionalized MOFs will be developed."OtherDepartment of Energy
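Once reaction enthalpies for CO2 sorption are computed, screening reduces to a filter over candidates. A sketch with invented MOF names and enthalpy values (the target window is an assumption for illustration, not the project's criterion):

```python
# Illustrative screening filter: keep candidate MOFs whose computed CO2
# sorption enthalpy falls in a target window. Names and numbers are
# hypothetical, not results from this project.

candidates = {
    "MOF-A": -35.0,   # kJ/mol, computed reaction enthalpy for CO2 sorption
    "MOF-B": -90.0,   # binds too strongly: costly to regenerate
    "MOF-C": -12.0,   # binds too weakly: poor capture
    "MOF-D": -55.0,
}

# Target window: strong enough to capture CO2, weak enough to release it
# on regeneration (bounds are assumed for the sketch)
LO, HI = -75.0, -25.0
promising = sorted(name for name, dH in candidates.items() if LO <= dH <= HI)
# promising == ["MOF-A", "MOF-D"]
```

In the described workflow this filter would run over databases of thousands of MOFs, with the enthalpies supplied by the computational chemistry methods rather than typed in by hand.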
DOE-0038-2023DOENational Energy Technology LaboratoryCreation of polymer datasets and inverse design of polymers with targeted backbones having High CO2 permeability and high CO2/N2 selectivity.Machine learning models were developed to predict CO2 permeability
and CO2/N2 selectivity of polymers. Novel methods were developed to
generate polymer datasets. Furthermore, a novel machine learning
technique is being developed to inverse design the polymers that will
have targeted properties."OtherDepartment of Energy
DOE-0039-2023DOENational Energy Technology LaboratoryData discovery, processing, and generation using machine learning for a range of CCS data and informationThe team will focus on supporting ongoing geospatial data collection
and publishing efforts leveraging the new EDX++ cloud computer
capabilities through ArcGIS Enterprise Portal. The use of Arc Enterprise
Portal will support the development of the Carbon Matchmaker tool, as
well as support the release of a new version of GeoCube, which will be
host to the updated Carbon Storage Open Database and NATCARB
completed in EY21. NETL is supporting DOE-FECM in developing and
releasing a survey and map for the Carbon Matchmaker, a tool
developed to enable stakeholders to self-identify carbon-dioxide-related
activities (production, utilization, storage, direct air capture, and
infrastructure/transportation) to identify and connect stakeholders and
support national collaborative opportunities. The ArcGIS Enterprise
Portal will be leveraged to build out a new version of GeoCube with the
migration of hundreds of spatial data layers into the new platform. The
migration of data to an Arc Enterprise based GeoCube will enable
easier version control for data integration and curation."Big Data, Natural Language Processing, OtherDepartment of Energy
DOE-0040-2023DOENational Energy Technology LaboratoryData platform to expedite access and reuse of carbon ore data for materials, manufacturing and researchData platform to expedite access and reuse of carbon ore data for
materials, manufacturing, and research. Assembled using data science
and NLP methods, and hosted in a virtual, multi-cloud platform for online
analytics."Natural Language Processing, OtherDepartment of Energy
DOE-0041-2023DOENational Energy Technology LaboratoryDatabase will be utilized to demonstrate targeted biocide strategies using AI to assess large DNA datasets.The team will develop a public DNA database that will advance
knowledge in produced water management. This project consists of two
phases: (1) the development and launching of the database, and (2) the
demonstration of applicability of the database by conducting a network
analysis. The work will be pursued as defined in the phases below. The
fully characterized streams will be used by other FWPs to estimate
overall resource recovery and will be used by other FWPs as a training
set for machine learning (ML) models to predict compositions when only
limited measurements can or have been completed for the produced
water."Big Data, OtherDepartment of Energy
DOE-0042-2023DOENational Energy Technology LaboratoryDemonstrate how ML-based approaches can help operators during active injection and post- injection monitoringTo demonstrate how ML-based approaches can help operators during
active injection and post-injection monitoring, it is necessary to
understand their needs and identify how ML-based approaches can
potentially meet or support those needs. Task 4 will establish data-
sharing protocols between SMART and the operator to create an
exchange mechanism that is not intrusive to the operator and provides
updates from ML results designed to enhance the operator decision
process. Demonstrate application of ML-based approaches to improve
site-monitoring and operations efforts performed during injection and
post-injection phases, e.g., using IL-ICCS data, and developing value-of-
information guidelines."OtherDepartment of Energy
DOE-0043-2023DOENational Energy Technology LaboratoryDemonstrate the robust performance of our ML method in a commercial-scale synthetic data and integrate image-to-image mapping with convolutional neural networksOur method quickly incorporates streaming observations for accurate
and timely forecasts with uncertainty quantification, taking reservoir
simulation data as inputs and incorporating real-time observation
streams for accurate, timely geological carbon storage forecasts.
Computational effort is distributed over many machines, which
facilitates coupled inversions using many ML models and allows for
ML-driven optimization and sensitivity analysis."Neural Networks, OtherDepartment of Energy
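The entry mentions distributing computation over many ML models with uncertainty quantification. One standard reading of that is ensemble forecasting: the spread across models estimates uncertainty, and incoming observations reweight the models. The forecasts below are invented for illustration, not project outputs:

```python
import numpy as np

# Hypothetical forecasts of CO2 plume extent (km) from an ensemble of ML models,
# each conditioned on the same reservoir-simulation inputs
forecasts = np.array([2.1, 2.3, 2.0, 2.4, 2.2])

mean = forecasts.mean()            # point forecast
spread = forecasts.std(ddof=1)     # uncertainty estimate from ensemble spread

# A streaming observation can be folded in by up-weighting models that match it
obs = 2.25
weights = 1.0 / (1e-6 + np.abs(forecasts - obs))
weights /= weights.sum()
weighted_mean = float(np.dot(weights, forecasts))
# the reweighted forecast moves toward the observation
```

Because each model's forecast is independent, the ensemble parallelizes trivially across machines, which matches the distributed-computation claim in the entry.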
DOE-0044-2023DOENational Energy Technology LaboratoryDevelop and demonstrate reinforcement learning approach for time-varying control for flexible hydrogen and power production.Efforts on IES control will include the development of a dynamic
optimization-based nonlinear model predictive control (NMPC)
framework. NMPC approaches for optimizing cell thermal management
and maximizing IES efficiency under set-point transition will be
developed for flexible operation. Reinforcement learning (RL)
approaches will also be developed for optimal control policy selection
and learning-based adaptive control. There are opportunities for
improved learning through interaction with the electrolyzer in addition to
learning from the MPC action. Multi-policy approaches will be developed
for control, independently by RL or in concert with MPC, or even for
scheduling the operating policy. The ultimate goal is to develop
operational strategies and an NMPC and RL control framework for
optimizing IES performance under flexible hydrogen and power
production scenarios, while minimizing physical and chemical
degradation over long-term operation."OtherDepartment of Energy
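As a minimal illustration of the set-point-transition problem the NMPC/RL framework targets (not the project's IES model), here is a discrete proportional controller driving a simple first-order plant toward a new set point; gains and dynamics are assumptions:

```python
# Sketch of set-point tracking: at each step the controller acts on the
# tracking error, and the plant responds with simple first-order dynamics.
# Both the gain and the plant model are illustrative assumptions.

def simulate(setpoint, x0=0.0, gain=0.5, steps=50):
    x = x0
    for _ in range(steps):
        u = gain * (setpoint - x)   # control action proportional to error
        x = x + 0.5 * u             # first-order plant response
    return x

final = simulate(setpoint=800.0)    # e.g., a hypothetical cell-temperature target
# final converges toward the set point as steps increase
```

MPC generalizes this by optimizing the action sequence over a prediction horizon under constraints, and RL replaces the hand-set gain with a learned policy, which is the combination the entry proposes.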
DOE-0045-2023DOENational Energy Technology LaboratoryDevelop fast predictive models using novel machine-learning based methods.Accurate, fast predictive ML models form the foundation for the virtual
learning platform. Generating training data and then developing ML-based
models enables a Virtual Learning Environment (VLE) for exploring and
testing strategies to optimize reservoir development, management, and
monitoring prior to field activities."OtherDepartment of Energy
DOE-0046-2023DOENational Energy Technology LaboratoryDevelop, integrate, and automate the reduction of CFD models while preserving acceptable levels of accuracy. In general for CCSI2, this work intends to focus on CFD applications.Will leverage state-of-the-art, physics-based deep learning (DL) models
to learn generalizable surrogates that may be used in place of CFD
models to predict quantities required for downstream optimization. The
products from this subtask can be immediately leveraged by other
subtasks that are seeking to speed up their CFD simulation models to
streamline their downstream analyses. Additionally, improvements to the
ML/AI interface in FOQUS include support for vector variables in the
ML/AI plugin, support for additional surrogate model tools (e.g.,
PyTorch, scikit-learn), and additional normalization function forms in
the ML/AI plugin."Neural Networks, OtherDepartment of Energy
DOE-0047-2023DOENational Energy Technology LaboratoryDevelopment of AI/ML methodsDevelop quality, reliability, and version control standards for SMART
software. Continue development of AI/ML methods for use by the 2A
and 2C activities, including modeling anomalies due to local
heterogeneity coupled with an enhanced capacitance-resistance model
(CRM) and Bayesian Belief Network (BBN) modeling integrated with
geochemistry. Continue development of advanced computational
approaches with modeling using the most advanced general purpose
PDE/ODE physics-informed neural network (PINN) tool developed by
NVIDIA and accelerate training PINNs using Wafer Scale Engine (WSE)
by Cerebras Systems Inc."OtherDepartment of Energy
DOE-0048-2023DOENational Energy Technology LaboratoryDevelopment of new machine learning-based process modeling capabilities that assess the viability and efficiency, with uncertainty quantification, of the chemical processes involved in the carbon fiber production and its output qualityProvide sub-pilot-scale verification of lab-scale developments on the
production of isotropic and mesophase coal-tar pitch (CTP) for carbon
fiber production, using coals from several U.S. coal-producing regions.
An extensive database and suite of tools for data analysis and economic
modeling, with an associated web-based community portal, will be
developed to relate process conditions to product quality, and to assess
the economic viability of coals from different regions for producing
specific high-value products."Artificial Intelligence UnknownDepartment of Energy
DOE-0049-2023DOENational Energy Technology LaboratoryDOE AI Data Infrastructure SystemLeveraging generative AI and cloud-enabled data infrastructure to
improve CCS user experience and connectivity, producing an adaptive
user interface that streamlines connection of CCS stakeholders to what
matters to them."Artificial Intelligence, Big Data, OtherDepartment of Energy
DOE-0050-2023DOENational Energy Technology LaboratoryFluid migration from well-to-well communication will be inputted in AI to determine a costs-benefit analysisThis project will develop an ML algorithm to predict the time when a
growing fracture will reach the monitored well. The ML workflow will be
trained on the distinctive tensile strain signature that precedes the
growing fracture. The new workflow will be designed to work in
conjunction with the fracture warning ML workflow developed in EY21.
Together, these workflows will: (1) provide an early warning of well-to-
well communication, (2) predict the measured depths where the
communication will happen, and (3) provide an estimated time until the
beginning of well-to-well communication."Artificial Intelligence, Big Data, OtherDepartment of Energy
DOE-0051-2023DOENational Energy Technology LaboratoryGeochemically Informed Leak Detection (GILD)A Bayesian Belief Network has been developed to interrogate the altered
geochemistry around a potential CO2 leakage site. The use of the BBN
and site-specific parameters will reduce the percentage of false
positives with this method."Artificial Intelligence, OtherDepartment of Energy
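The entry's claim, that conditioning on site-specific geochemistry cuts false positives, can be illustrated with plain Bayes' rule; the probabilities below are illustrative assumptions, not GILD's calibrated values:

```python
# Minimal Bayes'-rule sketch of leak detection from a geochemical signal.
# All numbers are invented for illustration.

def posterior_leak(p_leak, p_signal_given_leak, p_signal_given_no_leak):
    """P(leak | anomalous geochemistry) via Bayes' rule."""
    num = p_signal_given_leak * p_leak
    den = num + p_signal_given_no_leak * (1.0 - p_leak)
    return num / den

# Generic detector: anomalous chemistry is fairly common even without a leak
generic = posterior_leak(0.01, 0.95, 0.20)

# Site-informed network: the local geochemical baseline explains most
# anomalies away, so a surviving anomaly is much stronger evidence
informed = posterior_leak(0.01, 0.95, 0.02)
# informed >> generic: fewer alarms fire on non-leak anomalies
```

A Bayesian Belief Network extends this single update to a graph of interdependent geochemical variables, but the false-positive mechanism is the same: shrinking the probability of the signal under the no-leak hypothesis.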
DOE-0052-2023DOENational Energy Technology LaboratoryInitial case study using regulatory compliance (well integrity testing, fluid compositional data, geographic, and geologic information from oil and gas wells in the Wattenberg Field, Denver Basin, central Colorado, USAResearchers will apply artificial intelligence/machine learning (AI/ML)
techniques to national-scale well characterization and integrity test
datasets to yield new insights into leakage potential."OtherDepartment of Energy
DOE-0053-2023DOENational Energy Technology LaboratoryMachine learning based identification of current hazardous offshore metocean and bathymetric conditions that can impact safe offshore energy operationsBuild off user testing and further refine analytical logic to develop
Version 2 of the OGA smart tool for release on EDX. Continue
refinements to offshore hazard models, including wave and turbidity
current models. Draft manuscripts detailing the OGA Tool models and
algorithms. Assemble a metocean and seafloor database for release
with the OGA Tool Version 2 online; strategize web-hosted versions of
the OGA Tool and database."Big Data, Neural Networks, OtherDepartment of Energy
DOE-0054-2023DOENational Energy Technology LaboratoryMachine Learning for geophysical data inversionUse machine learning to generate synthetic seismic and gravity data,
and data-driven inversion for leak detection."OtherDepartment of Energy
DOE-0055-2023DOENational Energy Technology LaboratoryMachine learning for legacy well evaluationUse machine learning to identify common attributes that correlate with
well integrity issues, to prioritize wells for monitoring and remediation."OtherDepartment of Energy
DOE-0056-2023DOENational Energy Technology LaboratoryMachine learning to process multi-modal data and information to aid in the identification of undocumented orphaned wellsUse of machine learning to process and analyze trends and patterns in
known well data to predict undocumented orphaned wells, as well as
machine learning approaches to process different imagery-based data
to further classify and characterize additional undocumented orphaned
wells within the Appalachian Basin."Big Data, OtherDepartment of Energy
DOE-0057-2023DOENational Energy Technology LaboratoryMachine learning to refine and analyze data for CCS needsUtilize and apply different machine learning approaches to process data
and generate new derivative data products that help address CCS
stakeholder data needs for resource evaluation, risk assessment,
supply chains, social and environmental justice evaluations, regulatory
compliance, and more."Big Data, OtherDepartment of Energy
DOE-0058-2023DOENational Energy Technology LaboratoryMachine learning tool and model applications for CCS needsUtilize and apply different machine learning approaches to help model
and analyze Class VI well regulation data, CCS infrastructure
optimization, CCS data visualization, and interaction with “really big”
(petabyte-scale) datasets used for CCS resource characterization and
risk reduction (e.g., reflection seismic surveys) within the EDX multi-cloud
ecosystem."Big Data, OtherDepartment of Energy
DOE-0059-2023DOENational Energy Technology LaboratoryML-based approaches to improve site characterization effortsDemonstrate application of ML-based approaches to improve site-characterization efforts performed during the pre-injection phase using
data from either IBDP (for which data are currently available) or other
opportunistic field demonstration or commercial projects (for which data
may become available) and develop value of information guidelines.
Demonstrate how ML-based rapid forecasting can be used to help with
pre-injection reservoir management decisions under data uncertainties.
Demonstrate how a visualization platform with ML-based models can"OtherDepartment of Energy
DOE-0060-2023DOENational Energy Technology LaboratoryML-based proxy models and multi-level data driven fracture network imaging to support rapid decision making.ML-based proxy models of fracture network, HF geometry, HF
properties, bottomhole pressure, and drainage volume contribute to
fracture network, production forecast, and well drainage volume
visualizations."OtherDepartment of Energy
DOE-0061-2023DOENational Energy Technology LaboratoryML-based reduced order models of reservoir response to CO2 injection into saline and/or hydrocarbon-bearing formations - as the basis for integrated assessment modeling of leakage risk (e.g., SACROC)Generally, the approach used by NRAP researchers to address these
questions is to develop a robust, science-based integrated assessment
framework that links fast forecasting models of CO2 storage system
components (e.g., storage reservoir; leakage pathways including wells,
faults, and fractured caprock; intermediate formations; and receptors of
concern, including groundwater aquifers and the atmosphere).
Superimposed on this system model are various fit-for-purpose
analytical capabilities that support stakeholder
decision making for questions related to site-specific risk evolution,
risk-based area of review delineation, conformance assessment, and
post-injection site monitoring.
In Task 2.0, researchers will augment and expand this functionality to
demonstrate relevance to industry-standard site risk management
methods (i.e., bowtie analysis framework) and to understand
containment performance and leakage risk for scenarios where a site
transitions from CO2 utilization for EOR to dedicated CO2 storage. To
ensure that risk assessment efforts are informative to real geologic
storage deployment scenarios, NRAP researchers will engage with a
diverse set of stakeholders to establish an appropriate modeling and
risk assessment design basis."OtherDepartment of Energy
DOE-0062-2023DOENational Energy Technology LaboratoryNatural Language ProcessingInformation and articles on energy storage will be gathered and
reviewed. Natural language processing (NLP) algorithms will be
developed and used to help categorize and understand various energy
storage efforts in the R&D communities. Additionally, trends within the
discovered and selected topical focus areas in energy storage will be
examined. This will provide a view of energy storage R&D which is not
biased or limited to known search terms."Big Data, Natural Language Processing, OtherDepartment of Energy
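The record above describes NLP-based categorization of energy-storage literature. As a minimal, hedged illustration of the keyword-scoring baseline such a pipeline might start from (the topic names, seed terms, and abstract below are invented for the example, not taken from the DOE project):

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens, punctuation dropped."""
    return re.findall(r"[a-z]+", text.lower())

def categorize(abstract, topic_terms):
    """Score each topic by how often its seed terms appear; return the best-scoring topic."""
    counts = Counter(tokenize(abstract))
    scores = {topic: sum(counts[t] for t in terms)
              for topic, terms in topic_terms.items()}
    return max(scores, key=scores.get)

# Hypothetical seed terms for two energy-storage R&D focus areas.
topics = {
    "batteries": ["battery", "lithium", "electrode", "anode"],
    "hydrogen": ["hydrogen", "electrolyzer", "fuel", "ammonia"],
}
print(categorize("A lithium-ion battery electrode coating study.", topics))  # batteries
```

A production pipeline would of course use richer NLP (topic models, embeddings) precisely to escape fixed seed-term lists; this sketch only shows the categorization step in its simplest form.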
DOE-0063-2023DOENational Energy Technology LaboratoryNeural networks used to compensate a drone-mounted magnetic sensor for maneuvering of the drone.Electromagnetic technology development and optimization for cased
wells. Scalable solutions—getting to 100,000 wells/year through drone
technology and ML technology. NETL will develop ML algorithms to
compensate magnetic data for the maneuvering of drone aircraft.
Magnetic noise can limit sensitivity of detection and resolution of
anomalies in the magnetic data. The ML algorithms will reduce attitude-
and heading-induced noise in drone magnetic surveys."Neural Networks, OtherDepartment of Energy
DOE-0064-2023DOENational Energy Technology LaboratoryOnline real-time system identificationWork will focus on using SI to monitor the condition of a power plant
boiler at different process states. SI algorithms will be implemented
within an MPC to provide continuous adaptability as the power plant
ramps through the entire range of operating loads. Once the control
algorithm has been developed to be effective on representative models,
it will be tested on a high-fidelity commercial power plant simulator or on
a real power plant facility. The online SI techniques will be tested on
historical power plant data, dynamic models (including a power plant
simulator), power generating equipment including laboratory pilot-scale
power systems, and on power plants where feasible."Artificial Intelligence, Big Data, OtherDepartment of Energy
DOE-0065-2023DOENational Energy Technology LaboratoryPrediction of gasification gas yield and compositions using machine learningA machine learning (ML) model will be developed to aid in investigating
and optimizing gasification with various feedstocks such as waste plastics,
waste coal, biomass, and MSW. A gasification database will be
built from literature, prior experiments at NETL, and
new experiments generated at NETL. AI/ML will be a part of the project,
combined with experimental study to accelerate development of
gasification applied to various feedstocks including waste plastics,
waste coal, MSW, and their mixtures. The ML model will have more impact as the
larger database is built."Big Data, OtherDepartment of Energy
DOE-0066-2023DOENational Energy Technology LaboratoryReduce computational cost of CFD simulations that screen for more efficient intensified solvent contactor geometries.Collaborate with Subtask 4.3 Machine Learning Support to reduce the
computational complexity of validated CFD calculations using Deeper
Fluids (DF), graph neural networks (GNNs), or similar ML approaches.
Further development of ongoing process modeling/optimization,
ultimately informed by the CFD reduced order models (ROMs), will also
be a focus."Neural Networks, OtherDepartment of Energy
DOE-0067-2023DOENational Energy Technology LaboratoryRokbase Geologic Core Data ToolThis project will develop the platform through which the DOE OGFL data
are easily accessible, searchable, and described, enabling future R&D,
sustainable resource planning, and responsible stewardship of the
nation’s resources. NETL’s expertise in developing geo-data
science, ML, visualization, online data mining and integration, and
advanced analytics through scientific computing (including high
performance computing and big data computing methods) and
virtualized environments can be leveraged to support further intelligent
analytics for offshore systems."Neural Networks, OtherDepartment of Energy
DOE-0068-2023DOENational Energy Technology LaboratorySolving Field Equations on the Wafer Scale EngineThe intent is to develop a collocated, finite volume code to allow
maximum mesh flexibility and support advanced CFD capabilities found
in modern CFD codes like Fluent, OpenFOAM, and MFiX.
NETL will take a metered approach to development towards a fully
reacting CFD capability on the WSE. EY22 will be filled with API
capability expansions needed to support general purpose CFD
applications, such as general purpose finite volume formulations,
collocated grid capabilities (Rhie & Chow interpolation), bit stuffing to
save memory when dealing with cell types, general purpose boundary
conditions, etc. In addition, the code will be benchmarked in a series of
tests towards a fully reacting CFD capability that will support problems
of interest to FECM."Big Data, OtherDepartment of Energy
DOE-0069-2023DOENational Energy Technology LaboratoryTo drive insights on the dependencies between the natural gas and electricity sectors to increase reliability of the NG systemCommercially available models will be used to generate predictive
scenarios."Big DataDepartment of Energy
DOE-0070-2023DOENational Energy Technology LaboratoryTo drive insights on the power system reliability, cost, and operations during the energy transition with and without FECM technologiesCommercially available models will be used to generate predictive
scenarios."Big DataDepartment of Energy
DOE-0071-2023DOENational Energy Technology LaboratoryTo accelerate discovery of protection system and laser processing of protective coatings on CMC for hydrogen turbines.The objectives of this project are to design, process, and validate a
laser-manufactured, integrated, and graded bond coat-environmental
barrier coat-thermal barrier coat (BC-EBC-TBC) system that can
effectively protect and lead to the use of Silicon Carbide fiber/Silicon
Carbide (SiCf/SiC) matrix CMCs in next-generation hydrogen-fueled
turbines."Artificial Intelligence UnknownDepartment of Energy
DOE-0072-2023DOENational Energy Technology LaboratoryTo accurately predict alloy & component performance extrapolated to conditions where experimental results do not exist.AI/ML will be used to interrogate databases comprising experimental
data, literature data, and synthetic data generated by improved physics-based
models, in order to generate reduced order models that accurately predict
the performance of materials and components under extreme
environments (temperature, atmosphere) and complex loading (cyclical,
triaxial) for long service-life durations."Big Data, OtherDepartment of Energy
DOE-0073-2023DOENational Energy Technology LaboratoryTo analyze data and derive insights and improve predictions to forecast wellbore kick events to reduce loss of control events.Use of neural networks and/or AI cluster data analysis methods to
improve detection and forecasting of wellbore- and drilling-related loss-of-control
events, known as kicks, to improve real-time detection and
prediction of these conditions."Neural Networks, OtherDepartment of Energy
DOE-0074-2023DOENational Energy Technology LaboratoryTo apply machine learning applications to map carbon ore, rare earth element, and critical mineral resourcesTo identify information gaps, GIS and machine learning applications will
be used to map carbon ore, rare earth element, and critical mineral
resource, infrastructure, and market data in consultation with NETL
geospatial modeling activities. Research needs and technology gaps will
be assessed, and resources targeted for sampling and characterization.
This effort will provide a complete Northern Appalachian carbon ore,
rare earth element, and critical mineral value chain basinal assessment
to enable quick development of commercial projects."Artificial Intelligence UnknownDepartment of Energy
DOE-0075-2023DOENational Energy Technology LaboratoryTo apply machine learning and data analytics techniques to integrated subsurface datasets to predict key reservoir properties and compare various fields across the area of study and to correlate vintage data with new data and address the distribution of fractures and vugs.Laboratory experiments will be used to optimize a CO2 flood
composition specific to HTD rock properties, and subsequently design
and simulate injection scenarios that offer wettability alteration, foaming,
and reduced surface tension. This work will improve oil recovery from
matrix porosity and mitigate the impact of fracture zones. The optimized
design will be implemented and tested in a Trenton/Black River field.
The results will provide strategies to improve oil recovery in complex
carbonate formations in the Michigan Basin as well as in other
carbonate plays."Artificial Intelligence, Big DataDepartment of Energy
DOE-0076-2023DOENational Energy Technology LaboratoryTo apply machine learning methods to explore the inter-well uncertainty in the Goldsmith Landreth San Andres Unit and to update reservoir models.Engineered water can lower interfacial tension and minimize capillary
forces so that gravity can push the oil up and out of the matrix. This
proposal is to test this technology at field scale, in the Goldsmith
Landreth San Andres Unit. Apply history matching of flexible interface-
based reservoir models and ML methods such as generative
adversarial networks that provide new methods to explore the inter-well
uncertainty and to update the reservoir models."Artificial Intelligence UnknownDepartment of Energy
DOE-0077-2023DOENational Energy Technology LaboratoryTo automate development of proxy models for power generation combustion systems.Detailed CFD of large combustion systems will be performed. From
the results, machine learning will be used to develop fast proxy models
which will provide results close to the CFD results, but in a small
fraction of the time. These fast models will then be used in real-time
digital twin models of the power plant, which can be used to help the
power plant operator to spot instrumentation failures or cyberattacks on
the plant."OtherDepartment of Energy
DOE-0078-2023DOENational Energy Technology LaboratoryTo automate RDE image analysis, machine learning for RDE image analysis is being employed.The expected outcome of this project will be extensive experimental
data that can provide valuable insight into RDC design, coupling RDC with
turbomachinery, model validation, and next generation combustion
sensors that use artificial intelligence and computer vision. Design
of an optimized inlet to maximize pressure gain in an RDE relies on an
understanding of the coupling between the inlet plenums (fuel and air),
the combustor annular channel, and the exhaust diffuser. This creates a
challenge for CFD as the models are large and computationally
expensive. NETL is continuing a collaboration with the University of
Michigan to accelerate reacting flow CFD modeling using machine
learning (ML)."OtherDepartment of Energy
DOE-0079-2023DOENational Energy Technology LaboratoryTo build the first data analytics and artificial intelligence field laboratory for unconventional resources in the Powder River Basin, focusing on optimization of hydraulic fracture stimulations through the use of multiple diagnostic technologies.To establish a tight oil Field Laboratory in the Powder River Basin and
accelerate the development of three major unconventional oil resources
through detailed geologic characterization and improved geologic
models leading to significant advances in well completion and fracture
stimulation designs specific to these three formations. Utilize multivariate
analysis to understand the interrelationship between completion
and stimulation controls on well productivity."Artificial Intelligence, Big DataDepartment of Energy
DOE-0080-2023DOENational Energy Technology LaboratoryTo create a data-driven multiscale phytotechnology framework for identification and remediation of leached-metals-contaminated soil.The project objectives are to integrate satellite remote sensing, machine
learning and image processing, geological engineering models, and soil
science and plant pathology to: 1) identify potential leaching of metals
from coal ash impoundments (Phase I), and 2) propose locally
adaptable phytoextraction approaches to remediate contaminated
regions (Phase II)."Artificial Intelligence UnknownDepartment of Energy
DOE-0081-2023DOENational Energy Technology LaboratoryTo create and apply machine learning algorithms to predict carbon dioxide enhanced oil recovery improvements with rich gas in the Bell Creek Field and other selected fields.Create models with ML algorithms to predict CO2 EOR improvements
with rich gas in the Bell Creek Field and other selected fields. The
results of these models will be compared with the predictions of CMG’s
reservoir simulation models."Artificial Intelligence UnknownDepartment of Energy
DOE-0082-2023DOENational Energy Technology LaboratoryTo create reduced order models for predicting long term performance degradation behavior of fuel cells and electrolyzers.Machine learning algorithms are being used to analyze large datasets of
microstructural and performance degradation simulations of various
electrode microstructures to develop reduced order models that can be
used for long-term performance degradation predictions of large area
fuel cell/electrolysis cells and cell stacks. The reduced order models can
be used for dynamic simulations that can more accurately mimic the
changing loading conditions of the modern grid."Big Data, OtherDepartment of Energy
DOE-0083-2023DOENational Energy Technology LaboratoryTo demonstrate multi-gamma based sensor technology for as-fired coal property measurementApplying an advanced multigamma attenuation (MGA) sensor to
accurately and precisely measure coal properties at the point of
injection into burners.
One research objective is to perform MGA testing and database
development for neural network-based fingerprinting of coal
properties. This will include neural network refinement with MGA data
and to upgrade Microbeam’s Combustion System Performance Indices
(CSPI) – CoalTracker (CT) program with MGA-based neural network
algorithms."Artificial Intelligence, Neural NetworksDepartment of Energy
DOE-0084-2023DOENational Energy Technology LaboratoryTo deploy dynamic neural network optimization to minimize heat rate during ramping for coal.The primary objective of the proposed work is to 1) deploy dynamic
neural network optimization (D-NNO) to minimize heat rate during all
phases of operation (ramping, low load, and high load) at a coal power
plant. The project will build a high-fidelity, systems-level, dynamic model
of the plant for a rapid prototyping environment for the D-NNO and to
allow researchers to better understand the dynamic phenomena that
occur during ramping and at various plant loads, and 2) commercialize
D-NNO as a readily available software application by working with an
industry-proven software platform. The plant will be perturbed over time
to allow machine learning (ML) models to be fitted to the plant’s
response data."Artificial Intelligence, Neural NetworksDepartment of Energy
DOE-0085-2023DOENational Energy Technology LaboratoryTo design, develop, and demonstrate an AI-integrated physics-based attack resilient proactive system.Enable "defense-in-depth" cyber-physical system (CPS) security and
resiliency for the distribution grid. The recipient will design, develop, and
demonstrate a vendor-agnostic scalable Artificial Intelligence Integrated
Attack-Resilient Proactive System (AI-ARPS) for utility distribution grid
systems including advanced distribution management system (ADMS)
and DER management system (DERMS) applications."Artificial Intelligence UnknownDepartment of Energy
DOE-0086-2023DOENational Energy Technology LaboratoryTo design, proto-type and demonstrate a miniaturized implementation of a multi-process, high-spatial-resolution monitoring system for boiler condition management.Project will develop control logic for automated control of bituminous
coal-fired boiler. Plant operational data will be compared against
monitoring data to determine when different sensor output from a
miniaturized high temperature multi-process, high-spatial-resolution
monitoring system signifies damaging conditions in that region of the
boiler, and what operational changes can be made to eliminate the
damaging condition. The control logic will be developed for automated
control of soot-blowing and other boiler operations."Department of Energy
DOE-0087-2023DOENational Energy Technology LaboratoryTo detect leaks and creaks.The relevant research has been focused on demonstrating applicability
of novel machine learning based approaches to two major challenges
associated with safe management of large-scale geologic CO2 storage
operations: early detection of leaks (i.e., by detecting small leaks) and
early detection of induced seismicity (i.e., by detecting small seismic
signals)."Artificial Intelligence UnknownDepartment of Energy
DOE-0088-2023DOENational Energy Technology LaboratoryTo develop 5G integrated edge computing platform for efficient component monitoring in coal-fired power plantsDevelop an on-demand distributed edge computing platform to gather,
process, and efficiently analyze the component health data in coal-fired
power plants. Given that edge computing servers are closer to the field
devices in modernized power plants, the efficiency of edge computing
services with respect to dynamic orchestration, resource data collection,
and health information monitoring will be investigated for timely detection
of remote faults and to perform diagnosis."Big DataDepartment of Energy
DOE-0089-2023DOENational Energy Technology LaboratoryTo develop a deep-learning Artificial Intelligence model for analysis of fundamental combustion characteristicsA deep-learning Artificial Intelligence model will be pursued for rapid
analysis of detailed fundamental combustion characteristics that support
the design and troubleshooting process of H2-containing fuel combustor
development."Artificial Intelligence, Neural NetworksDepartment of Energy
DOE-0090-2023DOENational Energy Technology LaboratoryTo develop a general drag model for assemblies of non-spherical particles created with artificial neural networksThe project plans to develop a more accurate artificial neural network
(ANN)-based method for modeling the momentum exchange in fluid-
solid multiphase mixtures to significantly improve the accuracy and
reduce the uncertainty of multiphase numerical codes and, in particular,
of MFiX, by developing and providing a general and accurate method for
determining the drag coefficients of assemblies of non-spherical
particles for wide ranges of Reynolds numbers, Stokes numbers, and
fluid-solid properties and characteristics. The research team will achieve
this goal by conducting numerical computations with a validated in-
house CFD code and using artificial intelligence methods to develop an
ANN that will be implemented in TensorFlow and linked with the MFiX
code."Artificial Intelligence, Neural NetworksDepartment of Energy
DOE-0091-2023DOENational Energy Technology LaboratoryTo develop a novel platform for secure data logging and processing in fossil fuel power generation systems using blockchain and machine learning to reduce down time for fossil energy power plants, limit reductions of power and reduce cost for repairs.Machine learning model development will consist of traditional machine
learning and deep learning algorithm implementations for anomaly
detection. A machine learning server will be used to develop the
traditional models using One-Class Support Vector Machine (SVM) and
K-Means Clustering, and deep learning models using Recurrent Neural
Network (RNN) and its various implementations such as Long Short-Term
Memory (LSTM), Gated Recurrent Unit (GRU), Generative Adversarial
Network (GAN), and Autoencoders, using the sensor data collected from the
secure sensor network."Artificial Intelligence, Neural NetworksDepartment of Energy
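Of the techniques named in the record above, the clustering half of the anomaly-detection idea is simple to illustrate: fit clusters to "normal" sensor readings, then flag readings far from every cluster center. A minimal stdlib-only sketch along those lines; the two operating modes and the readings are made up, and the project itself would use far richer models:

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means: fit k centroids to a list of (x, y) points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Recompute each centroid as its cluster mean; keep the old one if empty.
        centroids = [
            tuple(sum(v) / len(c) for v in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids

def anomaly_score(reading, centroids):
    """Distance to the nearest centroid; large scores suggest an anomaly."""
    return min(math.dist(reading, c) for c in centroids)

# Hypothetical sensor readings around two normal operating modes.
random.seed(1)
normal = [(random.gauss(0, 0.1), random.gauss(0, 0.1)) for _ in range(50)] \
       + [(random.gauss(5, 0.1), random.gauss(5, 0.1)) for _ in range(50)]
centroids = kmeans(normal, 2)
print(anomaly_score((0.05, -0.02), centroids))  # small: typical reading
print(anomaly_score((9.0, -4.0), centroids))    # large: flag for review
```

Thresholding the score (for instance at a high percentile of scores on held-out normal data) turns this into the kind of unsupervised anomaly detector the record describes in its traditional-ML branch.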
DOE-0092-2023DOENational Energy Technology LaboratoryTo develop a wireless, distributed data acquisition and interpretation system for seismic monitoring and carbon storage characterization.Resensys plans to develop a wireless, distributed data acquisition and
interpretation system tailored for monitoring and characterization of
seismic activity at carbon storage sites. The seismicity data collected in
real time during the CO2 storage site characterization and sequestration
processes, combined with advanced signal processing and Artificial
Intelligence and Machine Learning (AI/ML) methodologies, will provide an
understanding of natural seismicity risks prior to any CO2 injection and
prior to making large investments in developing the storage project."Artificial Intelligence UnknownDepartment of Energy
DOE-0093-2023DOENational Energy Technology LaboratoryTo develop an AI-driven integrated autonomous robotic visual inspection (RVI) platform.The overall objective of the research is to develop an AI-driven
integrated autonomous robotic visual inspection (RVI) platform that can
perform real-time defect identificationdynamic path planningand safe
navigation in a closed-loop manner. The"Artificial Intelligence, Robotic Processing Automation (RPA)Department of Energy
DOE-0094-2023DOENational Energy Technology LaboratoryTo develop an Artificial intelligence- based model for rotating detonation engine designsAn artificial intelligence-based model will be used to develop low-loss
rotating detonation engine (RDE) designs for use in power generation
using natural gas/syngas mixtures. The model formulation will enable full-
scale RDE calculations over 100-1000 detonation cycles."Artificial Intelligence UnknownDepartment of Energy
DOE-0095-2023DOENational Energy Technology LaboratoryTo develop and create an autonomous robotic inspection system.The goal of the project is to prevent negative environmental and
socioeconomic impacts of coal waste (coal ash and tailings) by
developing an aerial robot-enabled inspection and monitoring system of
active and abandoned coal ash and tailings storage facilities. The first
objective of this project is the development of a programmable drone
equipped with several complementary sensors that will autonomously
inspect several structures of a storage facility. The second objective of
this project is to create artificial intelligence-based hazard detection
algorithms that will use multispectral and georeferenced images (i.e.,
thermal and visual) and 3D point cloud data collected by an
autonomous drone to detect hazards in the storage facility structure that
would indicate uncontrolled leakage to the environment or lead to the
potential failure of the structure."Artificial Intelligence, Robotic Processing Automation (RPA)Department of Energy
DOE-0096-2023DOENational Energy Technology LaboratoryTo develop and demonstrate drone- based geophysical and remote- sensing technologies to quantify critical minerals (CM).To develop and demonstrate drone-based geophysical and remote-
sensing technologies to quantify critical minerals (CM) in coal, coal-related,
unconventional, and secondary sources or energy-related waste
streams. Drone-based geophysical surveys and remote sensing,
combined with artificial intelligence/machine learning (AI/ML) analytics
for real-time integration and analysis, have the potential to transform
characterization and monitoring for CM from conventional and
secondary resources."Artificial Intelligence, Robotic Processing Automation (RPA)Department of Energy
DOE-0097-2023DOENational Energy Technology LaboratoryTo develop and evaluate a general drag model for gas-solid flows via physics-informed deep machine learningThe project will evaluate the performance of several ANN algorithms for
machine learning, pertinent to deep neural network (DNN)
algorithms. The DNN candidates will include random forest (RF), BPNN,
XGBoost, and other supervised deep neural network algorithms. The
best DNN algorithm will be identified by ranking these algorithms’
performance. The Recipient will integrate the deep learning ANN model
(DNN model) into the multiphase flow simulation software MFiX-DEM,
which is part of NETL’s open source CFD suite of software, MFiX.
The DNN-based drag model developed on TensorFlow will be
implemented using NETL’s existing software links between MFiX and
TensorFlow."Artificial Intelligence, Neural NetworksDepartment of Energy
DOE-0098-2023DOENational Energy Technology LaboratoryTo develop and validate sensor hardware and analytical algorithms to lower plant operating expenses for the pulverized coal utility boiler fleetThe objective is to develop and validate sensor hardware and analytical
algorithms to lower plant operating expenses for the pulverized coal
utility boiler fleet. The focus is on relatively inexpensive new “Internet of
Things” technologies to minimize capital investment. Three technologies
will be explored for demonstration and full-scale testing in a coal-fired
power plant. The first focuses on gas and steam temperature control
issues at low load. The second uses sensors and analytic algorithms for
monitoring coal pulverizer operation at lower loads to reduce the
minimum firing capability of coal burners. The third investigates new
sensors and advanced controls to better balance air and fuel at each
burner, enabling reduction in the minimum firing capability of coal
burners."Department of Energy
DOE-0099-2023DOENational Energy Technology LaboratoryTo develop artificial intelligence- enabled tools (ArtIT) for cyber hardening of power grids.To develop a novel resiliency framework for power grids by integrating
different theories, such as closed-loop controls, security, agility, formal
reasoning and synthesis, machine learning, and laboratory setup
demonstration. The framework will provide enhanced resiliency for wide-area
control operations under cyberattack."Artificial Intelligence UnknownDepartment of Energy
DOE-0100-2023DOENational Energy Technology LaboratoryTo develop drag models for non- spherical particles through machine learningProduce comprehensive experimental and numerical datasets for gas-
solid flows in well-controlled settings to understand the aerodynamic
drag of non-spherical particles in the dense regime. The datasets and
the gained knowledge will train deep neural networks to formulate a
general drag model for use directly in NETL MFiX-DEM module. This will
help to advance the accuracy and prediction fidelity of the computational
tools that will be used in designing and optimizing fluidized beds and
chemical looping reactors."Artificial Intelligence, Neural NetworksDepartment of Energy
DOE-0101-2023DOENational Energy Technology LaboratoryTo develop high fidelity tools which run in near real time not only help in the field to guide and optimize complex operations but can be used as digital twinsTo develop high-fidelity tools which run in near real time and not only
help in the field to guide and optimize complex operations but can also
be used as digital twins for cyber security and cyber-physical modeling."Big DataDepartment of Energy
DOE-0102-2023DOENational Energy Technology LaboratoryTo develop innovative biomonitoring and remediation of heavy metals using phytotechnologies.The objective of the work is to utilize algal- and cyanobacterial-based
phycotechnologies to address pervasive heavy metal contamination
from coal combustion product (CCP) impoundments at the Savannah
River Site. Novel bioindicators will be developed to gauge the potential
for phytoremediation to restore legacy impoundment sites."Artificial Intelligence UnknownDepartment of Energy
DOE-0103-2023DOENational Energy Technology LaboratoryTo develop low cost conversion of coal to grapheneDemonstrate the techno-economic feasibility of a 250 ton/day
manufacturing facility to convert coal to high-quality graphene. The core
technology is based on flash joule heating (FJH) to convert various
coals to graphene. Machine learning algorithms will map out the
correlation of processing parameters with the final product (graphene
yield, quality, dimensions)."Natural Language Processing, Neural NetworksDepartment of Energy
DOE-0104-2023DOENational Energy Technology LaboratoryTo drive insights on emissions from natural gas production, storage, and transmission to determine how best to reduce emissionsAI/ML will be used to recognize patterns in well integrity records that
could predict failure events."Big Data, OtherDepartment of Energy
DOE-0105-2023DOENational Energy Technology LaboratoryTo drive insights on environmental performance of the natural gas system to inform effective mitigation strategiesLife Cycle Analysis models will be used to define and estimate
environmental parameters/performance"Big Data, OtherDepartment of Energy
DOE-0106-2023DOENational Energy Technology LaboratoryTo drive insights on pipeline maintenance and repair strategies to reduce incidents of pipeline leakage; support evaluation of use and reuse strategiesML will be used to develop a pipeline risk assessment geospatial model
and support evaluation of use and reuse opportunities."Big Data, OtherDepartment of Energy
DOE-0107-2023DOENational Energy Technology LaboratoryTo drive insights on water recovery from cooling tower plumesStudy of plume formation and collection on mechanical (induced) draft
cooling towers, partly in a high-fidelity controlled environment and partly
on a full-scale industrial cooling tower. It will start by building the needed
laboratory setup and installing various sensors on the lab cooling tower.
At the same time, a computational fluid dynamics (CFD) model will be
implemented to get precise full-scale plume models. Using the insights
into power-plant plume characteristics, the project will iterate on and
experimentally test electrodes and collectors, which make up modular
panels, on the lab cooling tower. What has been learned from the full-
scale plume modeling and sensor data analysis will then be applied to
develop a design model to build the optimal collection apparatus for
given working conditions."Department of Energy
DOE-0108-2023DOENational Energy Technology LaboratoryTo drive insights through data-driven predictive modeling to forecast the remaining lifespan and future risk of offshore production platforms.An Artificial Neural Network and Gradient Boosted Regression Tree
were developed and applied to predict the remaining lifespan of
production platforms. These big data-driven models resulted in
predictions with scored accuracies of 95–97%."Artificial Intelligence, Big Data, Neural Networks, OtherDepartment of Energy
DOE-0109-2023DOENational Energy Technology LaboratoryTo drive insights using machine learning-based dynamics, control, and health models and tools developed by NETL to gain valuable operational data, insights, andML will be used to develop dynamics, controls, and health models for
operating power generation facilities"OtherDepartment of Energy
DOE-0110-2023DOENational Energy Technology LaboratoryTo employ machine learning to study the dependence of electrochemical performance on microstructural detailsWith a significant number of images. The Recipient will build deep
learning methods at the object detection stage using the Region Based
Convolutional Neural Network (RCNN) or You Only Look Once (YOLO)
class of algorithms, the heart of which is a deep learning image
classifier. Deep learning algorithms will also be built using convolutional
layers followed by residual layers to extract feature vector descriptors in
the second stage. In the third and fourth stages of affinity and
association, a recurrent neural network approach can be used to build a
tracker. All of these approaches require a large training set that will
enable sophisticated models to be built to handle the complexity of the
application.
With a limited number of images. In the case that there is a limited
number of images, the Recipient will still be able to follow the processing
pipeline. The Recipient will determine a suitable approach, with
concurrence from the project manager. Two potential approaches
include:
• Transfer learning: training the image classifier in the object detector on
images of similar quality and appearance; and
• Match filtering: detection, feature extraction, and matching based on
traditional image processing and computer vision techniques."Artificial Intelligence, Neural NetworksDepartment of Energy
DOE-0111-2023DOENational Energy Technology LaboratoryTo enhance the SimCCS toolset to better account for existent infrastructure and to more broadly engage other user bases to improve toolset performance and applicability.Continue development of the SimCCS toolset, which is utilized to
determine optimal placement for CO2 pipeline rights of way (ROW) and
infrastructure in a machine-learning-driven methodology that
considers environmentally sensitive areas, Justice40 considerations,
and utilization of existent infrastructure."Artificial Intelligence UnknownDepartment of Energy
DOE-0112-2023DOENational Energy Technology LaboratoryTo evaluate current infrastructure throughout a study area and evaluate future infrastructure needs to accelerate the deployment of CCUSOne key task focuses on evaluating current infrastructure throughout the
Initiative study area and evaluating future infrastructure needs to
accelerate the deployment of CCUS. LANL will utilize its unique
technologies for this project, focusing on SimCCS, with a minor
consulting role using NRAP and machine learning algorithms."Artificial Intelligence UnknownDepartment of Energy
DOE-0113-2023DOENational Energy Technology LaboratoryTo explore and analyze hydrogen-fueled rotating detonation engines using advanced turbulent combustion modeling and high-fidelity simulation tools.(1) analysis of injector design effects on RDE parasitic combustion; (2)
understanding the impact of RDE ignition mechanism and initial
transients on the ensuing detonation wave behavior; (3) deployment and
assessment of machine learning assisted turbulent combustion models
for predictive and computationally efficient RDE CFD simulations; and
(4) development of a highly scalable high-order CFD modeling
framework for scale-resolving simulations of full-scale RDEs and
investigation of TCI and wall boundary layer effects."Artificial Intelligence UnknownDepartment of Energy
DOE-0114-2023DOENational Energy Technology LaboratoryTo fill critical data gaps in big data analytics and machine learning applications to inform decision making and improve the ultimate recovery of unconventional oil and natural gas resources.The project will conduct numerical analysis of all-digital pressure sensing
technology, which will be used to create a synthetic dataset with downhole
pressure sensor readings for each stage; the dataset will be analyzed
statistically with DA to integrate with software."Artificial Intelligence, Big DataDepartment of Energy
DOE-0115-2023DOENational Energy Technology LaboratoryTo help automate data discovery and preparations to support a range of CS models, tools, and productsAI & ML are used to help collect and process data from multiple sources
to further integrate and characterize information to provide additional
data and information to support a range of carbon storage work."Big Data, Natural Language Processing, OtherDepartment of Energy
DOE-0116-2023DOENational Energy Technology LaboratoryTo help automate data integration and exploration for geologic core properties related information.Using natural language processing, deep learning neural networks, and
possibly TensorFlow for image analytics."Big Data, Natural Language Processing, OtherDepartment of Energy
DOE-0117-2023DOENational Energy Technology LaboratoryTo identify and characterize REE-CM hot zones using machine learning-aided multi-physics.Develop and field demonstrate a machine learning (ML)-aided multi-
physics approach for rapid identification and characterization of REE-
CM hot zones in mine tailings with a focus on coal and sulfide mine
tailings or other processing or utilization byproducts, such as fly ash and
refuse deposits."Artificial Intelligence UnknownDepartment of Energy
DOE-0118-2023DOENational Energy Technology LaboratoryTo implement boiler health monitoring using a hybrid first principles-artificial intelligence modelDevelop methodologies and algorithms to yield (1) a hybrid first-
principles artificial intelligence (AI) model of a PC boiler, (2) a physics-
based approach to material damage informed by ex-service component
evaluation, and (3) an online health-monitoring framework that
synergistically leverages the hybrid models and plant measurements to
provide the spatial and temporal profile of key transport variables and
characteristic measures for plant health."Artificial Intelligence UnknownDepartment of Energy
DOE-0119-2023DOENational Energy Technology LaboratoryTo implement machine learning to predict aerodynamic and combustion characteristics in hydrogen turbineDesign rules and reduced models will be formulated by combining high
fidelity simulations of chemically reacting flows, stochastic modeling
techniques, reduced modeling through machine learning, and testing of
injector configurations. These can be used in an industrial setting to
predict the aerodynamic and combustion characteristics in hydrogen
turbine combustors, based upon which design decisions are made."Artificial Intelligence UnknownDepartment of Energy
DOE-0120-2023DOENational Energy Technology LaboratoryTo implement novel SSC-CCS sensing technology and associated condition-based monitoring (CBM) software for improved understanding of the boiler tube failure mechanismsA preliminary condition-based monitoring (CBM) package with graphic
user interface (GUI) will be developed. This GUI will allow the operators
to view the current and historical signals of temperature profiles of the
boiler tube at specific sensor locations. Combining the pre-existing
conditions and the opinions from designers/operators/experts’
experiences, the system will be integrated with EPRI’s Boiler Failure
Reduction Program to provide assessments on the health conditions of
the boiler tubes, warnings/diagnoses on potential failures and locations,
and suggestions on maintenance locations and schedules."Department of Energy
DOE-0121-2023DOENational Energy Technology LaboratoryTo implement sensor-driven deep learning/artificial intelligence for power plant monitoringSensor-driven deep learning/artificial intelligence for intelligent health
monitoring capabilities that occur at the sensor (embedded computing)
or base station (edge computing). Will give power plant operators more
prediction tools for scheduling maintenance. Focus is on a high-
priority in-situ boiler temperature measurement system that relies on
chipless RFID technology and much-needed temperature, pressure,
environmental, and water quality industrial sensors."Artificial Intelligence, Neural NetworksDepartment of Energy
DOE-0122-2023DOENational Energy Technology LaboratoryTo implement unsupervised learning based interaction force model for nonspherical particles in incompressible flowsDevelop a neural network-based interaction (drag and lifting) force
model. A database will be constructed of the interaction force between
the non-spherical particles and the fluid phase based on the particle-
resolved direct numerical simulation (PR-DNS) with immersed boundary-
based lattice Boltzmann method (IB-LBM). An unsupervised learning
method, i.e., variational auto-encoder (VAE), will be used to improve the
diversity of the non-spherical particle library and to extract the primitive
shape factors determining the drag and lifting forces. The interaction
force model will be trained and validated with a simple but effective multi-
layer feed-forward neural network: multi-layer perceptron (MLP), which
will be concatenated after the encoder of the previously trained VAE for
geometry feature extraction."Artificial Intelligence, Neural NetworksDepartment of Energy
DOE-0123-2023DOENational Energy Technology LaboratoryTo improve control of hybrid SOFC- gas turbine power systems.Machine learning algorithms are being developed and compared to
other control methods for SOFC-gas turbine hybrid power generation
systems."OtherDepartment of Energy
DOE-0124-2023DOENational Energy Technology LaboratoryTo leverage disparate data to update assessments, analytics, and information for NATCARB and CS AtlasML is utilized to parse and generate additional data and information that
can be parsed and labeled to provide additional inputs for geologic
carbon storage assessments from multiple sources."OtherDepartment of Energy
DOE-0125-2023DOENational Energy Technology LaboratoryTo leverage machine learning and predictive analytics to advance the state of the art in pipeline infrastructure integrity management.The purpose of this project is to leverage advances in machine learning
and predictive analytics to advance the state of the art in pipeline
infrastructure integrity management using forecasted (predicted)
pipeline condition, using large sets of pipeline integrity data (periodic
nondestructive inspection, NDI) and continuous operational data (e.g.,
sensor data used to monitor flow rate and temperature) generated by oil
and gas (O&G) transmission pipeline operators."Artificial Intelligence UnknownDepartment of Energy
DOE-0126-2023DOENational Energy Technology LaboratoryTo leverage ML models to increase the size and complexity of problems that can be optimized within IDAES.The objective is to leverage ML models as surrogates for complex unit
operations or to bridge between scales to increase the size and
complexity of models that can be optimized within IDAES."OtherDepartment of Energy
DOE-0127-2023DOENational Energy Technology LaboratoryTo perform reconstruction of the 3D temperature field using Neural Networks with measured and known propagation paths.The sensor will first be tested up to 300 °C. For high-temperature tests,
the Recipient will use Alstom’s Industrial Size Burner Test Facility (ISBF)
or another appropriate facility. The high-temperature sensor will be first
tested from room temperature to 1800 °C. The results will be
compared with data obtained using other methods such as surface
acoustic wave (SAW), thermocouples, and optical fiber sensors. A 3D
temperature mapping will be created by fusing the high-temperature
sensor data. The Recipient will test the system’s survivability in a boiler
environment. A high-temperature sensing array will be tested to map the
temperature distribution within an exhaust pipe. The sensor array will be
tested at one 6’’ port or a similar location. The Recipient will also
perform reconstruction of the 3D temperature field using Neural
Networks with measured and known propagation paths."Artificial Intelligence UnknownDepartment of Energy
DOE-0128-2023DOENational Energy Technology LaboratoryTo provide an effective quality assurance method for additively manufactured gasThe primary goal of this project is to develop a cost-effective quality
assurance (QA) method that can rapidly qualify laser powder bed fusion
(LPBF) processed hot gas path turbine components (HGPTCs) through
a machine learning framework which would assimilate in-situ monitoring
and measurement, ex-situ characterization, and simulation data. The
project technical deliverable will be a rapid QA tool capable of: i) building
a metadata package of process-structure-property data and models
intended for LPBF-processed HGPTCs by mining both simulation and in-
situ/ex-situ characterization data; and ii) qualifying online/offline a
manufactured component by inputting simulation with/without in-situ
monitoring data to the developed algorithms to predict porosity and
fatigue properties. The target application of this QA tool will be
advanced HGPTCs produced by LPBF in Inconel 718. Data mining
techniques will be developed to consolidate and analyze the
heterogeneous big data stemming from the aforementioned methods of
upfront simulation, online monitoring, and post-build characterization,
thus enabling collaborative learning about the process-microstructure-
properties relationship. The resultant QA package includes a process-
structure-property database and machine learning tools for using LPBF
metal AM to fabricate HGPTCs. The developed metadata package
enables online/offline qualification of additively manufactured turbine
components by inputting simulation with/without in-situ monitoring data
to the developed machine learning algorithms to predict porosity and
fatigue properties."Artificial Intelligence UnknownDepartment of Energy
DOE-0129-2023DOENational Energy Technology LaboratoryTo provide combustion performance and emissions optimization through integration of a miniaturized high-temperature multi-process monitoring systemProject will develop control logic for automated control of a lignite coal-
fired boiler. Plant operational data will be compared against monitoring
data to determine when different sensor output from a miniaturized high-
temperature multi-process, high-spatial-resolution monitoring system
signifies damaging conditions in that region of the boiler, and what
operational changes can be made to eliminate the damaging condition.
The control logic will be developed for automated control of soot-blowing
and other boiler operations."Department of Energy
DOE-0130-2023DOENational Energy Technology LaboratoryTo provide insights into opportunities to beneficiate and use hydrocarbon infrastructure for alternative uses such as offshore carbon storage.Multiple big data-driven AI/ML models will be used to evaluate geologic,
geospatial, and infrastructure-related information to inform predictions
using natural language processing, Artificial Neural Networks, and
possibly Bayesian networks as well."Big Data, OtherDepartment of Energy
DOE-0131-2023DOENational Energy Technology LaboratoryTo provide integrated boiler management through advanced condition monitoring and component assessment.The Integrated Creep-Fatigue Management System represents an
online boiler damage monitoring system applicable to creep and fatigue.
The system will be configured to allow connectivity to the plant data
historian (e.g., OSISoft:PI) and to commercial finite element software
(e.g., ANSYS and Abaqus). In addition to configuring interaction with
finite element software, existing damage mechanism monitoring
modules will also be deployed using online analytical calculations. This
functionality will be applied to terminal tubes entering the boiler header
for which the combined mechanisms of creep and oxidation can be
calculated without the need for a finite element analysis."Department of Energy
DOE-0132-2023DOENational Energy Technology LaboratoryTo provide natural gas leak detection and quality controlEmploying machine learning techniques to train sensing systems to
quantify the concentration of natural gas species, distinguish between
natural gas at different parts of the processing pipeline, and distinguish
natural gas from natural and man-made interfering sources such as
wetlands and agriculture."Artificial Intelligence UnknownDepartment of Energy
DOE-0133-2023DOENational Energy Technology LaboratoryTo realize next generation solid-state power substation.The objective of the proposed project is to realize next generation solid-
state power substation (SSPS) incorporating machine learning, cyber-
physical anomaly detection, and multi-agent distributed networked
control. The project will have the following capabilities: distributed control
and coordination coupled with localized intelligence and sensing;
autonomous control for plug-and-play; automatic reconfiguration,
recovery, and restoration enabling decoupled, asynchronous, and fractal
systems."Artificial Intelligence UnknownDepartment of Energy
DOE-0134-2023DOENational Energy Technology LaboratoryTo research and develop physics- aware and AI-enabled cyber- physical intrusion response for the power grid.Responding to anomalous cyber and physical events in a timely manner
requires fusing data from both cyber and physical sensors into
actionable information. Thus, cyber-physical intrusion response research
will be conducted that leverages cyber and physical side data and
models with artificial intelligence (AI) as a scalable approach to maintain
or regain power system resilience under anomalous incidents such as
cyber threats."Artificial Intelligence UnknownDepartment of Energy
DOE-0135-2023DOENational Energy Technology LaboratoryTo use advanced machine learning techniques to analyze static and dynamic measurements of proppant distribution and fracture geometry data.The project will use advanced ML techniques to analyze static and
dynamic measurements of proppant distribution and fracture geometry
data from thousands of microchips injected with proppant near the
wellbore."Artificial Intelligence UnknownDepartment of Energy
DOE-0136-2023DOENational Energy Technology LaboratoryTo use AI to calibrate the simulation model by matching simulation data with production history data.Task 2 - Together with GEM, CMG’s intelligent optimization and analysis
tool, CMOST Artificial Intelligence (AI), will be used to calibrate the
simulation model by matching simulation results with production history
data. Based on the data sets, a series of simulation cases will be
generated to perform parameter estimation using a systematic
approach. As simulation jobs complete, the results will be analyzed
using CMOST AI to determine how well they match production history.
An optimizer will then determine parameter values for new simulation
jobs."Artificial Intelligence UnknownDepartment of Energy
DOE-0137-2023DOENational Energy Technology LaboratoryTo use computational tools to optimize the design of solid CO2 sorbents.The objective of this project is to use computational tools to optimize the
design of solid CO2 sorbents based on functionalized PIM-1 (or other
porous, glassy polymers) impregnated with molecular primary amines.
The expected outcome of this project is to inform, via computational
methods, which polymer structure and which molecular amines can lead
to a solid sorbent in which CO2 loading capacity, CO2 heat of
adsorption, and overall CO2 mass transfer rate are optimal at extremely
low CO2 partial pressures while amine leaching has been minimized."OtherDepartment of Energy
DOE-0138-2023DOENational Energy Technology LaboratoryTo use data analytics and machine learning techniques to advance understanding of the characteristics of the Emerging Paradox Oil PlayUsing data analytics and machine learning techniques to advance
understanding of the characteristics of the entire Paradox oil play
through integration of geologic and log-derived “electrofacies” models
and upscaling to 3D seismic data and propagation through the seismic
volume."Artificial Intelligence, Big Data, Neural NetworksDepartment of Energy
DOE-0139-2023DOENational Energy Technology LaboratoryTo use ML to help identify promising oxygen carrier materials.A combination of experimental data and computational results will be
used both to understand O2 production and to develop a machine
learning model that can be used to identify promising carrier
compositions. These compositions will be evaluated on two primary
criteria: performance and ability to be synthesized. Once the model has
identified promising candidates, these materials will be synthesized and
compared to existing carriers. This new data will then be used to refine
the models."OtherDepartment of Energy
DOE-0140-2023DOENational Energy Technology LaboratoryTo verify and validate testing of advanced power generation technologiesVerification and validation testing with direct support and collaboration
from operating power plants with advanced power generation
technologies and prime mover and downstream systems using near-
real-time data, resulting in better-informed plant operators and reduced
disruptions, while meeting changing service demands based on
enhanced operating flexibility."Artificial Intelligence, Big DataDepartment of Energy
DOE-0141-2023DOENational Energy Technology LaboratoryTransform reservoir management decisions through rapid analysis of real time data to visualize forecasted behavior in an advanced control room "human-in-the-loop" format.Improve low-fidelity model performance by transfer-learning with high-
fidelity data, and reduce uncertainty by combining high-fidelity and lower-
fidelity models for improved UQ performance."OtherDepartment of Energy
DOE-0142-2023DOENational Energy Technology LaboratoryUNET and other approaches for ML-based inversionResearchers will develop a design basis for risk-based monitoring
considering data dimensionality, uncertainty, and inter-tool/module
connectivity, and define the components of the monitoring design
optimization tool (DREAM) to be incorporated into NRAP-Open-IAM and
the SMART platform."Artificial Intelligence, OtherDepartment of Energy
DOE-0143-2023DOENational Energy Technology LaboratoryUse AI to process large sensor datasets for identification and classification of NG pipeline conditions and methane leaksFocused on development of advanced data analytic techniques and
methods for distributed OFS technology, including AI and ML, for
identification of signatures and patterns representative of hazards,
defects, and operational parameters of the natural gas pipeline network."Big Data, OtherDepartment of Energy
DOE-0144-2023DOENational Energy Technology LaboratoryUse ML to analyze the existing H2 and natural gas pipelines to identify the key parameters that can enable the H2 transport and storage at a large scaleThis task aims to use geo-data science methods and geospatial
information science to analyze the existing H2 and natural gas pipelines
to identify the key parameters that can enable the H2 transport and
storage at a large scale. The results can help to justify the importance of
real-time pipeline monitoring and recommend optimized sensor
deployment strategies to support smart maintenance and methane
emissions reduction goals."Big Data, OtherDepartment of Energy
DOE-0145-2023DOENational Energy Technology LaboratoryUse ML to enable a geophysical monitoring toolkit, and assimilate real-time modeling and data.ML-enabled rapid and autonomous geophysical monitoring and real-
time modeling and data assimilation tools (along with visualization and
decision-support frameworks) work together to radically improve
pressure and stress imaging."OtherDepartment of Energy
DOE-0146-2023DOENational Energy Technology LaboratoryUse ML to reduce high-fidelity physical models to a fast calculation that requires minimal effort to initiate.The platform will combine an intuitive user interface and visualization
capabilities from gaming software with the speed and enhanced detail in
evaluating reservoir dynamics and processes through ML /reduced
order model approaches. Advancements made with ML will alleviate the
need for both the expert user and the computational infrastructure and
make understanding subsurface fluid flow accessible to the everyday
user with a moderate level of understanding of the physics of the
system. ML will allow the experts to reduce the high-fidelity physical
models to a fast calculation that requires a minimal amount of effort to
initiate, but allows a user to investigate their own scenarios without the
need for predetermined models. Application of the platform will rapidly
enhance the experience base required for deploying and managing
commercial-scale projects, particularly for CO2 storage projects where
field experience is limited, because of the anticipated intuitive translation
of subsurface dynamics in real-time."OtherDepartment of Energy
DOE-0147-2023DOENational Energy Technology LaboratoryUse of machine learning models to produce surrogates for efficient optimizationWe consider the use of machine learning models to produce surrogates
for efficient optimization. The IDAES implementation will be
demonstrated on a real-scale design problem focused on carbon
capture (e.g., rigorous MEA model) or an integrated energy system."OtherDepartment of Energy
DOE-0148-2023DOENational Energy Technology LaboratoryUsing AI to improve predictions of subsurface properties, analyze multi-variate inputs, address knowledge and information gaps to improve predictions and modeliUse of AI methods such as fuzzy logic, neural networks, TensorFlow,
and natural language processing to assist with knowledge and data
exploration, transformation, and integration, as well as modeling and
analysis of multi-variate data used in the resource assessment method
to improve outputs and predictions."Artificial Intelligence, Big Data, OtherDepartment of Energy
DOE-0149-2023DOENational Energy Technology LaboratoryUsing AI/ML to replace conventional geophysics inversion, which performs the process more quickly than the typical method and makes geophysical results more user-friendly.The project will deploy a high-sensitivity atomic magnetometer
(potassium magnetometer or helium 4 magnetometer) on a sUAS
platform. Baseline surveys using the sUAS platform with the magnetic
receiver payload will be flown at the same CarbonSAFE site that
baseline ground surveys were performed in EY21. Results of the
forward modeling performed in EY20 will determine whether MT or
CSEM (or both) methods will be tested. AI/ML will be used to replace
conventional geophysics inversion, performing the process more quickly
than the typical method and making geophysical results more user-friendly."Neural NetworksDepartment of Energy
DOE-0150-2023DOENational Energy Technology LaboratoryUsing ML to build predictive models of branching processes and develop novel algorithms for automated MIP solver tuningWe will collect dual gaps obtained as a result of using different
branching strategies and feed them into ALAMO, Pysmo, and other
machine learning approaches to build predictive models of branching
processes as a function of carefully chosen instance features. These
models will then be deployed as part of the IDAES platform to facilitate
optimization of advanced integrated energy systems. Currently, tuning
MIP solvers for a particular application is approached by ad-hoc trial-and-
error methods that are tedious and often ineffective, limiting design
engineers to solution of small problems. To address this challenge and
facilitate the solution of energy systems currently intractable, we
propose to develop novel algorithms for automated MIP solver tuning
through the use of machine learning."OtherDepartment of Energy
DOE-0151-2023DOENational Energy Technology LaboratoryUsing ML to design sensing materials which can work under harsh environments.The team proposes to develop an ML approach that relies upon
established experimental and theoretical evidence to gain a
comprehensive ML model and boost the gas sensing material design.
The essence of this approach will be to assess materials’ optimal
performance at a specific condition, such as temperature, pressure, and
radiation levels. The development of the package will occur in several
steps: (1) building a materials database from various sources; (2) using
ML techniques to build, evaluate, and optimize an ML model; (3)
predicting the temperature dependence of sensing properties, such as
gas selectivity, for FECM-relevant gas species to screen the materials in
the material bank, or proposing new sensing materials; and (4) exploring
the gas sensing mechanisms suited for high-temperature application for
those predicted most promising gas sensing materials."OtherDepartment of Energy
DOE-0152-2023DOENational Energy Technology LaboratoryUsing natural language processing to explore and extract information from historical literature/pdfsTraining and adaptation of natural language processing algorithms to
improve exploration and extraction of information from old, historical
scientific literature. Extraction of knowledge and data, as well as
preservation of key information."Big Data, Natural Language Processing, OtherDepartment of Energy
DOE-0153-2023DOENational Energy Technology LaboratoryUsing recursive neural networks and fiber optic cables to recognize strain patterns and warn operators that a fracture is coming.This project will develop an ML algorithm to predict the time when a
growing fracture will reach the monitored well. The ML workflow will be
trained on the distinctive tensile strain signature that precedes the
growing fracture. The new workflow will be designed to work in
conjunction with the fracture warning ML workflow developed in EY21.
Together, these workflows will: (1) provide an early warning of
well-to-well communication, (2) predict the measured depths where the
communication will happen, and (3) provide an estimated time until the
beginning of well-to-well communication."Neural Networks, OtherDepartment of Energy
DOE-0154-2023DOENational Energy Technology LaboratoryUsing time-series classification to assist in automated analysis of sensor data taken during experiments on the MHD test channel.The measurements of chemical composition will be combined with
resistance measurements to validate CFD models of the MHD channel
system. Specifically, validated CFD models will be able to separate the
contribution of the bulk and boundary layer resistance to the overall
resistance of the MHD channel."OtherDepartment of Energy
DOE-0155-2023DOENational Energy Technology LaboratoryWith sensor technologies and network developed, in the future, AI/ML may be used to accelerate data processing of sensor data from the sensor network.With sensor technologies and network developed, in the future, AI/ML
may be used to accelerate data processing of sensor data from the
sensor network to identify and predict risks and failures in plugged wells."Department of Energy
DOE-0156-2023DOEOffice of Environment, Health, Safety & SecurityApplications of Natural Language Processing and Similarity Measures for Similarity Ranking"EHSS has been developing applications of natural language
processing (NLP) and similarity measures for advanced information
retrieval and searching of datasets (e.g., SQL databases, CSV files,
reports) as well as estimating similarities between records within a
dataset or records between different datasets. Similarity search has
been successfully applied to efficiently search the DOE COVID-19 Hotline
question-and-answer database, searching DOE annual site
environmental reports, similarity between the DOE occurrence reporting and
processing system and lessons learned, and AIX data. Similarity
measures can also be used to identify opportunities for resource
prioritization and prediction.
As of October 2021, the tool is run locally by the principal investigator on
a project basis, as requested, or as a desktop application. Initial
development was initiated to move to a web-based application but was not
completed due to lack of user need and resources."Department of Energy
DOE-0157-2023DOEOffice of Environment, Health, Safety & SecurityData Analytics and Machine Learning (DAMaL) Tools for Analysis of Environment, Safety and Health (ES&H) data: Similarity Based Information Retrieval"The EHSS Data Analytics Machine Learning (DAMaL) similarity-based
information retrieval tool uses natural language processing
(NLP) and cosine similarity to leverage artificial intelligence (AI) to
increase the efficiency with which a user finds important records in the DOE
environment, safety, and health (ES&H) datasets (e.g., occurrence
reporting and processing system, fire protection, lessons learned,
accident and injury reporting system, contractor assurance system
(CAS)). The tool has no restriction on the text query, provides NLP
options to the user (e.g., stemming or lemmatization), and could be used
to improve decision-making in job planning activities, identifying hazards,
obtaining insights from operating experience and lessons learned,
data discovery and analysis, and accident investigations, among other areas.
As of October 2021, the tool was developed and deployed on the DAMaL tools
website. It is expected to continue to be maintained, documented
(e.g., user analysis guides), improved and enhanced, with increased data
sources."Department of Energy
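The similarity-retrieval idea behind the record above can be sketched in a few lines. This is an illustrative bag-of-words version with hypothetical example records, not the deployed DAMaL implementation, and it omits the NLP options (stemming, lemmatization) the tool offers:

```python
import math
from collections import Counter

def vectorize(text):
    # Simple bag-of-words term-frequency vector (no stemming/lemmatization).
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    # Cosine of the angle between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_records(query, records):
    # Return records sorted by similarity to the free-text query, best first.
    qv = vectorize(query)
    return sorted(records, key=lambda r: cosine_similarity(qv, vectorize(r)), reverse=True)

# Hypothetical ES&H-style records for illustration only.
records = [
    "lessons learned from fire protection inspection",
    "occurrence report on electrical equipment failure",
    "annual site environmental monitoring summary",
]
ranked = rank_records("fire protection lessons", records)
```

A production system would typically replace raw term frequencies with TF-IDF or learned embeddings, but the ranking step stays the same.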
DOE-0158-2023DOEOffice of Environment, Health, Safety & SecurityData Analytics and Machine Learning (DAMaL) Tools to enhance the analysis of Environment, Safety and Health (ES&H) data: Classification, Robotic Process Automation and Data Visualization"The EHSS Data Analytics Machine Learning (DAMaL)
classification, robotic process automation, and data visualization tool
uses natural language processing (NLP) and classification algorithms
(i.e., random forests) to automate the classification of records, visually
provide insights into trends, and provide an indication of importance
and risk. The tool leverages artificial intelligence (AI) to analyze the text
of the DOE environment, safety, and health (ES&H) and operating
experience dataset records (e.g., occurrence reporting and processing
system, fire protection, lessons learned, accident and injury reporting
system, contractor assurance system (CAS)) and identifies important
topics that can be used by an analyst to drill down and further explore
potential safety issues in DOE operations.
As of October 2021, the tool has been deployed on the DAMaL tools
website. It is expected to continue to be maintained, documented
(e.g., user analysis guides), improved and enhanced, with increased data
sources."Department of Energy
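The text-classification technique named above (random forests over NLP features) can be illustrated with a toy ensemble. This sketch uses bagged decision stumps over word-presence features, a deliberately simplified cousin of a random forest; the records, labels, and feature words are hypothetical, not DAMaL data:

```python
import random

# Toy labeled records: 1 = fire-protection topic, 0 = injury-reporting topic (hypothetical).
rows = [
    ("fire alarm inspection overdue", 1),
    ("fire suppression system fault", 1),
    ("worker injury during maintenance", 0),
    ("slip and fall injury report", 0),
]
feature_words = ["fire", "injury", "inspection", "report"]

def stump_train(rows, feature_words, rng):
    # Train one decision stump on a bootstrap sample: pick the word whose
    # presence/absence best predicts the label on that sample.
    sample = [rng.choice(rows) for _ in rows]
    def accuracy(word):
        return sum((word in text.split()) == bool(label) for text, label in sample) / len(sample)
    return max(feature_words, key=accuracy)

def forest_predict(stumps, text):
    # Majority vote over the ensemble of stumps.
    votes = sum(word in text.split() for word in stumps)
    return 1 if 2 * votes > len(stumps) else 0

rng = random.Random(1)
stumps = [stump_train(rows, feature_words, rng) for _ in range(5)]
```

A real random forest additionally randomizes the candidate features at each split and grows full trees; libraries such as scikit-learn provide that directly.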
DOE-0159-2023DOEOffice of Environment, Health, Safety & SecurityData Analytics and Machine Learning (DAMaL) Tools to enhance the analysis of Environment, Safety and Health (ES&H) data: Unsupervised Machine Learning Text Clustering"The EHSS Data Analytics Machine Learning (DAMaL)
unsupervised machine learning clustering tool uses natural language
processing (NLP) and clustering algorithms (i.e., k-means, DBSCAN,
and dimensionality reduction approaches) to leverage AI to analyze the
text of the DOE environment, safety, and health (ES&H) and operating
experience dataset records (e.g., occurrence reporting and processing
system, fire protection, lessons learned, accident and injury
reporting system, contractor assurance system (CAS)). The tool
identifies recurrent and important topics that can be used by an analyst
to drill down and further explore potential recurrent safety issues in
DOE operations.
As of October 2021, the tool has been partially deployed on the DAMaL
tools website. Development is mostly complete, with a use case in Fire
Protection Trending and Analysis completed and its report undergoing
review. It is expected to continue to be maintained, documented (e.g.,
user analysis guides), improved and enhanced, with increased data
sources."Department of Energy
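As a sketch of the k-means clustering named above: a minimal pure-Python implementation grouping toy two-dimensional points standing in for vectorized record texts. The data and seeding strategy are hypothetical; the real tool clusters NLP feature vectors after dimensionality reduction:

```python
def kmeans(points, k, iters=20):
    # Minimal k-means: seed centroids with evenly spaced points, then
    # alternate nearest-centroid assignment and centroid recomputation.
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster (keep old centroid if empty).
        centroids = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

# Two well-separated toy groups standing in for embedded record texts.
points = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids, clusters = kmeans(points, k=2)
```

DBSCAN, the other algorithm the entry names, instead grows clusters from density-connected neighborhoods and needs no preset cluster count.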
DOE-0160-2023DOEOffice of Environment, Health, Safety & SecurityMemorandum of Understanding Between the US DOE and US NRC on Cooperation in the Area of Operating Experience and Applications of Data Analytics (Signed June 2021)The purpose of the Memorandum of Understanding (MOU) between the
US DOE and US NRC on cooperation in the area of operating
experience and applications of data analytics (Signed June 2021) is to
efficiently use resources and to avoid needless duplication of effort by
sharing data, technical information, lessons learned, and, in some
cases, the costs related to the development of approaches and tools
whenever such cooperation and cost sharing may be done in a mutually
beneficial fashion. The technical areas for collaboration include those
related to operating experience and safety data collection and analysis,
including operational events, occupational injuries, hazardous substance
releases, nuclear safety, radiation protection, equipment failure,
accidents and accident precursors, trending analysis, and risk-informed
decision-making. Applications of data analytics in the analysis of
operating experience and safety data include data visualization and
analysis, artificial intelligence, machine learning, natural language
processing, predictive analytics and other advanced analysis
techniques, user interface design and deployment, and
decision-making using data analytics tools."Department of Energy
DOE-0161-2023DOEOffice of Legacy ManagementGroundwater ModelingGroundwater modeling includes parameter estimation.Department of Energy
DOE-0162-2023DOEOffice of Legacy ManagementSoil Moisture ModelingUse multisource machine learning to model soil moisture within the
lysimeter embedded within a disposal cell"Department of Energy
DOE-0163-2023DOEOffice of the Chief Information OfficerAI-Based Chat BotThe OCIO EITS Service Desk is exploring the ability to use AI chat bots
to interact with end-users. We are looking to have a single bot
architecture that is highly tuned to IT system languages to properly
handle the terms that may be used in an enterprise environment. The
primary benefit would be to make knowledge more available to the end-
users in a consumable manner. Additionally, it would connect to ITSM
workflows that could automate basic functions such as requesting an
account, providing permissions, or creating an MS Teams site.
Additionally, the technology needs to provide a significant
amount of feedback to the EITS Service Desk on unanswered
questions, dropped questions, ineffective responses, incorrect
responses, etc."Department of Energy
DOE-0164-2023DOEPacific Northwest National LaboratoryAdaptive Cyber-Physical Resilience for Building Control SystemsDeep learning models are used for predicting the operation of building
energy systems, for detecting and diagnosing the health state or cyber
attack presence, and for optimizing the building energy system
response to provide resilient operation and sustained energy efficiency."Department of Energy
DOE-0165-2023DOEPacific Northwest National LaboratoryAdvancing Market-Ready Building Energy Management by Cost-Effective Differentiable Predictive ControlAn AI-based differentiable programming framework for domain-aware,
data-efficient predictive modeling and AI-based control policy synthesis,
as well as methods for safety verification and online learning.
Domain-aware deep learning models are used for learning and predicting the
response of building systems and components and for optimizing the
building energy system response to provide resilient operation and
sustained energy efficiency."Department of Energy
DOE-0166-2023DOEPacific Northwest National LaboratoryAI techniques for identification of suitable delivery parking spaces in an urban scenarioWe are using AI (a Graph Neural Network) to determine the importance of
parking spaces in a city network for curb management, to promote
adoption of electric vehicles for freight delivery."Department of Energy
DOE-0167-2023DOEPacific Northwest National LaboratoryAI used for predictive modeling and real time control of traffic systemsDomain-aware deep learning models are used for predictive modeling of
traffic. Deep learning-based predictive controllers are trained from
simulated data to optimize traffic signaling and coordination for
improved traffic flow and reduced energy consumption and GHG
emissions."Department of Energy
DOE-0168-2023DOEPacific Northwest National LaboratoryAPT AnalyticsDevelopment of AI/ML for automated analysis of APT data.Department of Energy
DOE-0169-2023DOEPacific Northwest National LaboratoryElucidating Genetic and Environmental Risk Factors for Antipsychotic-induced Metabolic Adverse Effects Using AIDevelop AI methods to find phenotypes that capture the complex interaction
between the human genome, chronic diseases, and a drug's chemical
signature to predict adverse side-effects of a mental health drug on
a human population."Department of Energy
DOE-0170-2023DOEPacific Northwest National LaboratoryLaboratory AutomationEmploying machine learning to identify regions of interest in SEM and
TEM data. Automating data acquisition to improve efficiencies."Department of Energy
DOE-0171-2023DOEPacific Northwest National LaboratoryManaging curb allocation in citiesThis project's goal is to develop a city-scale dynamic curb use
simulation tool and an open-source curb management platform that
address the challenge of increased demand for curb-side parking."Department of Energy
DOE-0172-2023DOEPacific Northwest National LaboratoryPhysics-Informed Learning Machines for Multiscale and Multiphysics Problems (PhILMs)PhILMs investigators are developing physics-informed learning
machines by encoding physics knowledge into deep learning networks."Department of Energy
DOE-0173-2023DOEPacific Northwest National LaboratoryRegional waste feedstock conversion to biofuelsUnsupervised ML is used sequentially to group waste sources into
different regions. Calibrated game theoretic models are used to assess
the behavior and economic viability of different waste-to-energy
pathways within a region."Department of Energy
DOE-0174-2023DOEPacific Northwest National LaboratoryScalable, Efficient and Accelerated Causal Reasoning Operators, Graphs and Spikes for Earth and Embedded Systems (SEA-CROGS)Establish a center for scalable and efficient physics-informed machine
learning for science and engineering that will accelerate modeling,
inference, causal reasoning, etiology, and pathway discovery for earth
systems and embedded systems. Advances will lead to a higher level of
abstraction of operator regression to be implemented in next generation
neuromorphic computers."Department of Energy
DOE-0175-2023DOEPacific Northwest National LaboratorySurrogate models for probabilistic Bayesian inferenceWe are using AI/ML to build surrogate models of the observable
response of complex physical systems. These surrogate models will be
used for probabilistic model inversion of these systems with the goal of
estimating unknown model parameters from indirect observations."Department of Energy
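The surrogate-plus-inversion workflow described above can be sketched for a hypothetical one-dimensional system: a cheap emulator (here a piecewise-linear interpolant, standing in for the AI/ML surrogates the project actually builds) is trained on a handful of expensive-model runs, then used in place of the model to recover an unknown parameter from an indirect observation:

```python
def expensive_model(theta):
    # Stand-in for a costly physics simulation mapping one unknown
    # parameter to an observable response (hypothetical system).
    return theta ** 2 + 0.5 * theta

# 1) Run the expensive model at a modest set of design points.
knots = [i * 0.1 - 2.0 for i in range(41)]
responses = [expensive_model(t) for t in knots]

def surrogate(theta):
    # Cheap piecewise-linear emulator trained on the precomputed runs.
    for x0, x1, y0, y1 in zip(knots, knots[1:], responses, responses[1:]):
        if x0 <= theta <= x1:
            w = (theta - x0) / (x1 - x0)
            return y0 + w * (y1 - y0)
    raise ValueError("theta outside surrogate training range")

# 2) Invert: find the parameter whose surrogate prediction best matches
#    an indirect observation (MAP estimate under a flat prior on [0, 2]).
y_obs = expensive_model(0.7)  # synthetic "measurement"
grid = [i * 0.01 for i in range(201)]
theta_map = min(grid, key=lambda t: (surrogate(t) - y_obs) ** 2)
```

A full probabilistic treatment would evaluate a posterior density over the grid (or run MCMC against the surrogate) rather than return a single point estimate.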
DOE-0176-2023DOEThomas Jefferson LaboratoryUniversal MCEGR&D on an ML-based MC event generator that serves as a data
compactification utility."Department of Energy
DOE-0177-2023DOEWestern Area Power AdministrationFIMS - Invoice BOT - Employee Reimbursements FIMS - Invoice BOT - Purchase PowerPROCESS - Invoices are sent to the RPA Invoice Intake email box
(RPAInvoiceIntake@WAPA.GOV). Once a day, an unattended bot will
extract information from PDF invoices. The invoice is classified to
determine whether the invoice is an Employee Reimbursement or a
Purchase Power Invoice. The information extracted from the invoice is
then reviewed/validated by the Accounts Payable Technician. After
validation, the bot will load the information into the WAPA Financial
Management System."Operation and MaintenanceArtificial Intelligence, Document UnderstandingDepartment of Energy
DOI-0000-2023DOIBLMLand Use Plan Document and Data Mining and Analysis R&DExploring the potential to identify patterns, rule alignment or conflicts, discovery, and mapping of geo history and/or rules. Inputs include unstructured planning documents. Outputs identify conflicts in resource management planning rules with proposed action locations requiring exclusion, restrictions, or stipulations as defined in the planning documents.Planned (not in production)Natural Language Processing and Geo ClassificationDepartment of Interior
DOI-0001-2023DOIBORSeasonal/Temporary Wetland/Floodplain Delineation using Remote Sensing and Deep LearningReclamation was interested in determining whether recent advancements in machine learning, specifically convolutional neural network architecture in deep learning, could provide improved seasonal/temporary wetland/floodplain delineation (mapping) when high temporal and spatial resolution remote sensing data are available. If so, then these new mappings could inform the management of protected species and provide critical information to decision-makers during scenario analysis for operations and planning.CompletedImage classification using Joint Unsupervised Learning (JULE)Department of Interior
DOI-0002-2023DOIBORData Driven Sub-Seasonal Forecasting of Temperature and PrecipitationReclamation has run two year-long prize competitions where participants developed and deployed data-driven methods for sub-seasonal (2-6 weeks into the future) prediction of temperature and precipitation across the western US. Participants outperformed benchmark forecasts from NOAA. Reclamation is currently working with Scripps Institution of Oceanography to further refine, evaluate, and pilot implement the most promising methods from these two competitions. Improving sub-seasonal forecasts has significant potential to enhance water management outcomes.Development (not in production)Range of data driven, AI/ML techniques (e.g. random forests)Department of Interior
DOI-0003-2023DOIBORData Driven Streamflow ForecastingReclamation, along with partners from the CEATI hydropower industry group (e.g. TVA, DOE-PNNL, and others) ran a year-long evaluation of existing 10-day streamflow forecasting technologies and a companion prize competition open to the public, also focused on 10-day streamflow forecasts. Forecasts were issued every day for a year and verified against observed flows. Across locations and metrics, the top-performing forecast product was from a private AI/ML forecasting company, UpstreamTech. Several competitors from the prize competition also performed strongly, outperforming benchmark forecasts from NOAA. Reclamation is working to further evaluate the UpstreamTech forecast products and also the top performers from the prize competition.Development (not in production)Range of data driven, AI/ML techniques (e.g. LSTMs)Department of Interior
DOI-0004-2023DOIBORSnowcast ShowdownReclamation partnered with Bonneville Power Administration, NASA - Goddard Space Flight Center, U.S. Army Corps of Engineers, USDA - Natural Resources Conservation Service, U.S. Geological Survey, National Center for Atmospheric Research, DrivenData, HeroX, Ensemble, and NASA Tournament Lab to run the Snowcast Showdown Prize Competition. In this competition, participants were asked to develop methods to estimate distributed snow information by blending observations from different sources using machine learning methods that provide flexible and efficient algorithms for data-driven models and real-time prediction/estimation. Winning methods are now being evaluated and folded into a follow-on project with NOAA's River Forecast Centers.Development and AcquisitionRange of data driven, AI/ML techniquesDepartment of Interior
DOI-0005-2023DOIBORPyForecastPyForecast is a statistical/ML water supply forecasting software package developed by Reclamation that uses a range of data-driven methods.ImplementationRegression and related methodshttps://github.com/usbr/PyForecastDepartment of Interior
DOI-0006-2023DOIBORImproved Processing and Analysis of Test and Operating Data from Rotating MachinesThis project is exploring a better method to analyze DC ramp test data from rotating machines. Previous DC ramp test analysis requires engineering expertise to recognize characteristic curves from DC ramp test plots. DC ramp tests produce a plot of voltage vs current for a ramping voltage applied to a rotating machine. By using machine learning/AI tools, such as linear regression, the ramp test plots can be analyzed by computer software, rather than manual engineering analysis, to recognize characteristic curves. The anticipated result will be faster and more reliable analysis of field-performed DC ramp testing.Investigating/Proof of conceptDepartment of Interior
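The linear-regression screening the DC ramp test entry describes can be illustrated with a small sketch: fit an ordinary least-squares line to the voltage/current ramp data and flag large residuals as departures from the expected linear characteristic. The numbers below are hypothetical illustrations, not real test data:

```python
def linear_fit(xs, ys):
    # Ordinary least-squares fit of y = slope * x + intercept.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

def max_residual(xs, ys):
    # Largest deviation of measured current from the fitted line; a large
    # value flags a nonlinear (potentially problematic) characteristic curve.
    slope, intercept = linear_fit(xs, ys)
    return max(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))

voltage = [1.0, 2.0, 3.0, 4.0, 5.0]  # ramped test voltage (hypothetical units)
healthy = [0.5, 1.0, 1.5, 2.0, 2.5]  # linear current response
suspect = [0.5, 1.0, 1.5, 2.0, 6.0]  # current spike near the top of the ramp
```

In practice the residual threshold separating "characteristic" from "anomalous" curves would itself be calibrated against expert-labeled historical ramp tests.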
DOI-0007-2023DOIBORImproving UAS-derived photogrammetric data and analysis accuracy and confidence for high-resolution data sets using artificial intelligence and machine learningUAS-derived photogrammetric products contain a large amount of potential information that can be less accurate than required for analysis and time-consuming to analyze manually. By formulating a standard reference protocol and applying machine learning/artificial intelligence, this information will be unlocked to provide detailed analysis of Reclamation's assets for better-informed decision making.Proof-of-concept completedDepartment of Interior
DOI-0008-2023DOIBORPhotogrammetric Data Set Crack Mapping Technology SearchThis project is exploring a specific application of photogrammetric products to the analysis of crack mapping on Reclamation facilities. This analysis is time-consuming and has typically required rope access or other means to photograph and locate areas that can now be reached with drones or other devices. By formulating a standard reference protocol and applying machine learning/AI, this information will be used to provide detailed analysis of Reclamation assets for better decision making.Proof-of-concept completedDepartment of Interior
DOI-0009-2023DOIBSEESustained Casing Pressure IdentificationWell casing pressure requests are submitted to BSEE to determine whether a well platform is experiencing a sustained casing pressure (SCP) problem. SCP is usually caused by gas migration from a high-pressured subsurface formation through the leaking cement sheath in one of the well’s casing annuli, but SCP can also be caused by defects in tube connections, downhole accessories, or seals. Because SCP can lead to major safety issues, quickly identifying wells with SCP could greatly mitigate accidents on the well platforms. BSEE entered into an Inter-Agency Agreement with NASA's Advanced Supercomputing Division to help research the use of various AI techniques.Development (not in production)Machine learning via deep learning models, such as a Residual Neural Network (ResNet) and Convolutional Neural Networks (CNN)Department of Interior
DOI-0010-2023DOIBSEEWell Activity Report ClassificationResearching the use of self-supervised and supervised deep neural networks to identify classification systems for significant well events using data from Well Activity Reports.Development (not in production)Natural language processing (NLP) along with supervised and self-supervised machine learning via deep learning models, such as a Residual Neural Network (ResNet) and Convolutional Neural Networks (CNN).Department of Interior
DOI-0011-2023DOIBSEEWell RiskNASA's Advanced Supercomputing Division will utilize the work performed in the sustained casing pressure research to explore the development of machine learning models that identify various precursors of risk factors for wells. Identifying these risk factors would help inform BSEE engineers of potential problems with a well during its various stages of development.Development (not in production)Machine learning via deep learning model, such as a Residual Neural Network (ResNet) and Convolutional Neural Networks (CNN)Department of Interior
DOI-0012-2023DOIBSEEAutonomous Drone InspectionsBSEE is exploring the potential development of autonomous systems in drones to detect methane and inspect unsafe to board platforms on the outer continental shelf. Using autonomous drones will allow some inspection capabilities to be performed while maintaining the safety of inspectors without requiring extensive training to operate the drones.Development (not in production)Unknown at this timeDepartment of Interior
DOI-0013-2023DOIBSEELevel 1 Report Corrosion Level ClassificationLevel 1 surveys obtained from BSEE report the condition of well platforms. The reports include images of well platform components, which can be used to estimate coating condition and structural condition, important factors in the overall condition of the facility. The reports are used to assess the well platforms for safety concerns. The reports are submitted to BSEE and are manually reviewed to determine whether a well platform needs additional audits. Because the manual review process is time-consuming, an automated screening system that can identify parts of the wells that exhibit excess corrosion may greatly reduce report processing time. BSEE entered into an Inter-Agency Agreement with NASA's Advanced Supercomputing Division to help research the use of various AI techniques.ImplementationMachine learning via deep learning models, such as a Residual Neural Network (ResNet) and Convolutional Neural Networks (CNN)Department of Interior
DOI-0014-2023DOIUSGSDO NOT USE (21st Century IMT Applications Analysis AIML)Activity: Implement artificial intelligence (AI) and machine learning (ML) cloud services including SageMaker and Rekognition. Outcome/Value: Provide Cloud-based tools and services that present options to pursue investigations using machine learning or artificial intelligence-based approaches. These are critical capabilities to support predictive science and enabling the movement toward actionable intelligence.Development and AcquisitionConvolutional Neural NetworksDepartment of Interior
DOI-0015-2023DOIUSGSData Mining, Machine Learning and the IHS Markit DatabasesSupport the current DOI Secretarial Priority Project (Smart Energy Development) focused on the issue of identifying areas of potential conflict between energy development and alternative priorities, through the application of machine learning techniques to extract spatial patterns related to future development.
Lay the groundwork for the addition of new sets of skills, new types of analyses, and new products for the ERP and for the Mission Area; build internal knowledge about what machine learning can do for the ERP."ImplementationRandom Forest Regression, XGBoostDepartment of Interior
DOI-0016-2023DOIUSGSAluminum Criteria Development in MassachusettsThe USGS, in cooperation with MassDEP, will collect water-quality data at freshwater sites in Massachusetts, and use those data to demonstrate a process for calculating aluminum criteria based on a site's water chemistry (pH, DOC, and hardness) using a multiple linear regression model developed by the EPA (2017).ImplementationRandom Forest Classification and RegressionDepartment of Interior
DOI-0017-2023DOIUSGSMulti-scale modeling for ecosystem service economicsWork continues to expand the existing ARIES modeling framework using artificial intelligence and a set of decision rules to build a system that can select models and data based on appropriate contextual factors (e.g., climate, vegetation, soils, socioeconomics). Using national and global datasets, this system will be capable of mapping ES at a much greater level of accuracy than before. I will work to expand and implement this intelligent modeling system to the United States, yielding a consistent, nationwide, AI-supported intelligent ES modeling system to support ES assessment and valuation nationwide and beyond. This includes the integration of national economic accounts data with ecosystem services data to provide more timely, up to date, and integrated data at the national and subnational levels.ImplementationNeural network regressionDepartment of Interior
DOI-0018-2023DOIUSGSTwitchell Rice AFRIA large, interdisciplinary study (led by UC Davis in collaboration with UC Berkeley, the USGS and several private consultants) will be investigating the effects (subsidence, gas flux and water quality) of converting acreage on Twitchell Island, a deeply subsided island in the Sacramento-San Joaquin Delta, from drained row crops to flooded rice production. The USGS research objective is to assess water quality effects with respect to MeHg production under different rice management practices including tillage, flooding, and fertilization, quantifying the relative methylation potential of each practice.Implementationconvolutional neural networksDepartment of Interior
DOI-0019-2023DOIUSGSWOS.OS.NHM National Temperature ObservationsThe objectives of this project are to reduce the burden on Science Centers for the collection, storage, analysis, and processing of quality assurance data with the expectation this will lead to an increase of deployed sensors in the water temperature network. More specifically, the project will (1) modify software to allow for processing and storage of discrete water temperature data collected during streamflow measurements, (2) implement workflows and QA checks in data collection software that support new temperature policies and procedures, and (3) create a pilot program to support Science Centers in accomplishing 5-pt temperature checks.Initiationconvolutional neural networksDepartment of Interior
DOI-0020-2023DOIUSGSWRA.HIHR.WAIEE Building capacity for assessment and prediction of post-wildfire water availabilityAll listed objectives are focused on the western US:
· Collect multiple harmonized datasets from fire-affected basins in the western US that will advance development, calibration, and validation of water-quality models and assessment.
· Analyze harmonized datasets to assess regional differences in critical drivers of water quality impairment.
· Develop decision tree and standardized plan to determine locations to monitor after wildfire and ensure consistent post-fire water-quality data collection that accurately captures magnitude and duration of impairment.
· Develop rapid response plan to enable WSCs and WMA to be prepared for immediate responses for post-fire data collection and assessment.
· Establish the state of the science of critical drivers of post-fire water quality impairment in different ecoregions of the western U.S.
· Characterize critical drivers, including in-stream and reservoir-sediment interface contributions, to post-fire water quality impairment.
· Build catalog of methods for measuring remotely sensed water quality after wildfire and apply multiple test cases of application.
· Develop catalog of critical data needs for geospatial prediction of wildfire impacts on water.
· Construct blueprint for incorporating missing critical water-quality impairment processes into modeling and prediction.
· Prepare plan with IWP for incorporating wildfire effects on water availability into rapid prediction.
· Participate in development and application of a framework for cross-Mission Area integration of predictive approaches spanning temporal and spatial scales for post-fire hazards."Initiationconvolutional neural networksDepartment of Interior
DOI-0021-2023DOIUSGSWRA.NWC.WU Gap analysis for water useThe USGS Water Use Program requires a formal and detailed gap analysis of water-use data for the nation in order to better understand uncertainty in water-use estimates and to help inform future data collection and modeling efforts.  The primary objectives of this project are to: 1) identify the dominant water-use categories in different areas of the U.S.; 2) identify gaps in the available data for those categories, primarily gaps in data that if filled will improve model performance; and 3) identify potential methods for data estimation that can be used to fill gaps and provide the most benefit to water-use modeling efforts.  Other objectives include:  1) increasing understanding of data quality to help inform uncertainty in model predictions; 2) collaboration with model developers to understand water-use model sensitivity to input data in order to focus and prioritize future data collection; and 3) improved quality of data related to the extraction, delivery, and consumptive use of water for the important water use categories in different regions.  Water-use categories include public supply, domestic, industrial, thermoelectric power, irrigation, livestock, and aquaculture.  National models currently are under development for public supply, irrigation and thermoelectric. Initiationconvolutional neural networksDepartment of Interior
DOI-0022-2023DOIUSGSWRA.NWC.IWAA National Extent Hydrogeologic Framework for NWCThe primary objectives of this project are to (1) provide Nationally consistent predictions of groundwater quality (salinity and nutrients) relevant for human and ecological uses and its influence on surface-water, and (2) develop strategies for integrating these predictions into comprehensive water-availability assessments including the National Water Census and regional Integrated Water Availability Assessments. These primary objectives are organized by task as follows:
Task 1: Groundwater-Quality Prediction – salinity
· Provide accurate and reliable predictions of groundwater salinity at appropriate resolutions to document groundwater availability for human and ecological uses.
Task 2: Groundwater-Quality Prediction – nutrients
· Provide accurate and reliable predictions of nutrient concentrations in groundwater at appropriate resolutions to document groundwater availability for human and ecological uses.
Task 3: Incorporate Groundwater-Quality Predictions into Comprehensive Assessments of Water Availability
· Develop and refine strategies for coupling predictions of groundwater quality with groundwater flow and flux simulations from process-based models (e.g., GSFLOW, General Simulation Models) to quantify the amount of groundwater of a specified quality that is available and to better determine the effect of groundwater on surface-water quantity and quality.Initiationconvolutional neural networksDepartment of Interior
DOI-0023-2023DOIUSGSWRA.NWC.IWAA National-Extent Groundwater Quality Prediction for the National Water Census and Regional Integrated Water Availability AssessmentsThe primary objectives of this project are to (1) provide Nationally consistent predictions of groundwater quality (salinity and nutrients) relevant for human and ecological uses and its influence on surface-water, and (2) develop strategies for integrating these predictions into comprehensive water-availability assessments including the National Water Census and regional Integrated Water Availability Assessments.Initiationconvolutional neural networksDepartment of Interior
DOI-0024-2023DOIUSGSWRA.HIHR.WQP Process-guided Deep Learning for Predicting Dissolved Oxygen on Stream NetworksThe objective of this project is to build a model that predicts daily minimum, mean, and maximum stream DO levels on stream segments in the Lower Delaware River Basin using nationally available datasets.InitiationRandom Forest ClassificationDepartment of Interior
DOI-0025-2023DOIUSGSWRA.NWC.EF Economic Valuation of Ecosystem Services in the Delaware River BasinThe objectives of this project are to:
Create a data and model inventory plan to evaluate existing data and models.
Develop a database for the existing fish data.
Develop Artificial Intelligence/Machine Learning (AI/ML) models to predict fish abundances and size under alternate future climates and reservoir operations.
Develop models for economic valuation of the fishery resource.
Evaluate the validity of estimated economic models against alternative approaches.
Link models together to allow evaluation of tradeoffs between water use and the fisheries resource.
Provide a prototype web application with re-usable components for internal USGS use that promotes understanding of the models and allows assessment of resource tradeoffs.InitiationRandom forest regression, random forest classificationDepartment of Interior
DOI-0026-2023DOIUSGSWRA.NWC.IWAA Model Application for the National IWAAs and NWCIn support of both the periodic National Water Availability Assessment reports and the routinely updated National Water Census, the Model Application for the National IWAAs and NWC (MAPPNAT) project will have four major objectives related to model application development: 1) Provide initial applications of models for the National IWAAs reports and the National Water Census, 2) Provide periodic long-term projections for the National IWAAs reports and the National Water Census, 3) Provide routine model updates of current or near-current conditions for the National IWAAs reports and the National Water Census, and 4) Provide operational short-term forecasts for the National Water Census. These four objectives will ultimately cover multiple hydrologic sub-disciplines, including water budgets, water use, water quality, aquatic ecosystems, and drought. Objective 1 will require a combination of on-project and off-project modeling activities to provide the needed model applications for National IWAAs and NWC version 1. Objectives 2, 3, and 4 will begin with strategic planning activities before implementation using the available model applications. As new models are developed, the staffing, organization, and approach for this project will be developed in an integrated manner that can accommodate multiple sub-disciplines and differing domain expertise.InitiationRandom forest regression, random forest classification, random survival forests, neural networks, long short-term memory, recurrent neural networksDepartment of Interior
DOI-0027-2023DOIUSGSWRA.WPID.IWP.PUMP Turbidity ForecastingThis project aims to advance the use of national hydrological forecast models for delivering water quality forecasts relevant to water resource managers.InitiationConvolutional Neural NetworksDepartment of Interior
DOI-0028-2023DOIUSGSWRA.WPID.IWP.PUMP ExaSheds stream temperature projections with process-guided deep learningThis 3-year project will improve PGDL stream temperature models by adding new forms of process guidance and merging techniques developed by USGS and DOE staff in past projects. Model assessments will emphasize robustness to projections in not-previously-seen conditions, such as those of future climates, paving the way for reliable projections into future decades in the Delaware River Basin.InitiationConvolutional Neural NetworksDepartment of Interior
DOI-0029-2023DOIUSGSVegetation and Water DynamicsMajor activities include tracking vegetation phenology as a basic input for drought monitoring and for capturing the unique phenological signatures associated with irrigated agriculture and invasive species. Drought mapping and monitoring focus on two conterminous US wide operational tools, VegDRI and QuickDRI, to inform drought severity in a timely fashion. A targeted livestock forage assessment tool is tailored to quantify drought effects in terms of livestock forage deficits in kg/ha for specific producer decision makers. High latitude systems have high carbon stocks, particularly the numerous wetlands. Understanding spatiotemporal surface water dynamics will inform on permafrost degradation and probable methane emission hot spots. Vegetation phenology signatures improve land cover class separations and capture unique phenological signatures associated with invasive species like cheatgrass. Understanding remote sensing sensitivity of phenology tracking at various spatial resolutions and varying degrees of noise associated with mixed pixel effects of other vegetation, soils, and water improves accuracy and consistency of estimations of phenology as well as derivative products tailored for specific land manager use. The determination of irrigated and non-irrigated systems provides useful geospatial data for water management and can serve to isolate ecological comparisons or contrasts to either irrigated or non-irrigated land management.InitiationConvolutional Neural NetworksDepartment of Interior
DOI-0030-2023DOIUSGSDOMESTIC WELL VULNERABILITY SES INDICATORS NEW HAMPSHIREThe goals of this work are to: (1) investigate homeowner-level statistical associations between datasets on private wells (geology and land use, construction, hydraulics, and chemistry) and SES (and SES proxy) data; (2) investigate statewide census block-group level statistical associations between datasets on private wells (geology and land use, construction, hydraulics, and probabilities of arsenic and uranium contamination) and demographic and SES (and SES proxy) data; (3) identify indicators or triggers of vulnerability to private well water availability and quality in New Hampshire; and (4) broadly disseminate information from this study to scientific and general audiences, as well as to targeted community groups.InitiationConvolutional neural networksDepartment of Interior
DOI-0031-2023DOIUSGSTwo-Dimensional Detailed Hydraulic AnalysisThe USGS proposes to conduct analysis of detailed hydrology and develop a two-dimensional hydraulic model to assist in decision-making for the protection of life and property and local floodplain management and regulation. The following objectives are identified in the scope of this effort:
Data objectives include:
1. Topographic surveys in the study reaches to verify or augment existing topography used in prior analyses.
a. Transportation routes
b. Critical infrastructure
c. Various landforms
d. Non-structural flood mitigation recommendations at specific asset locations (USACE, 2019)
Interpretive objectives include:
1. Hydrologic analysis of the main stem of Joachim Creek (fig. 2) to produce discharge-frequency values for the 10%, 4%, 2%, 1%, and 0.2% regulatory flood flows.
2. Develop a calibrated two-dimensional hydraulic model inclusive of the following study reaches for the newly developed regulatory flood flows identified in interpretive objective (1) above:
a. Main stem 3.5-mile reach of Joachim Creek from a location above Highway E downstream to cross-section AI (fig. 2). The study reach is aligned with the existing regulatory FIS effective model and FIRM, bounded upstream at a mid-point location between cross-sections BC and BB and downstream at cross-section AI (fig. 3a, 3b).
3. Two-dimensional model simulations of the 10%, 4%, 2%, 1%, and 0.2% regulatory flood flows developed in interpretive objective (1) will produce flood profiles for the main stem of Joachim Creek.
4. Development of two-dimensional model-derived flood maps for the main stem of Joachim Creek will be disseminated for the newly defined 1% and 0.2% regulatory flood flows in interpretive objective (1). Model-derived maps will illustrate inundation extents, water-surface elevation, depth, and velocity, including a published table of comparisons with the summarized list of spatially relevant nonstructural flood mitigation assets defined in the preliminary FMP by USACE (USACE, 2019).InitiationDoodler: https://github.com/dbuscombe-usgs/dash_doodlerDepartment of Interior
DOI-0032-2023DOIUSGSGEMSC Geospatial Modernization and Machine Learning IntegrationThe USGS Director's office laid out a vision for the USGS for the next decade in the blog post “21st Century Science—Preparing for the Future”. A key component of this vision was outlined by stating “Over the next decade, we will take advantage of advances in sensor technologies, integrated modeling, artificial intelligence (AI), machine learning (ML), and high-performance computing to observe, understand, and project change across spatial and temporal scales in real-time and over the long term.” For GEMSC to play a role in this initiative, a multi-year project is proposed to integrate these technologies in GEMSC project workflows and data services. The overarching objective for this project is development of a strategic framework for integrating ERP science with traditional information technology related platforms.InitiationActive learning, transfer learning, deep learning, convolutional neural networks (Faster R-CNN, YOLOv5)Department of Interior
DOI-0033-2023DOIUSGS21st Century Prospecting: AI-assisted Surveying of Critical Mineral Potential (Reimbursable)Based on the mandate to assess critical minerals distributions in the US, MRP has entered into a partnership between USGS and DARPA. The objective of this partnership is to accelerate advances in science for understanding critical minerals, assessing unknown resources, and increasing mineral security for the Nation.InitiationLong short-term memory (LSTM) modelsDepartment of Interior
DOI-0034-2023DOIUSGSSWFL Habitat GIS ModelObjective 1 – Update and maintain a seamless digital library of predicted flycatcher breeding habitat displayed (rendered) as binary or 5-class probability maps. This effort is ongoing. Landsat reimages the same location every 16 days. Currently, the digital library that is housed within ESRI’s AGOL library contains SWFL habitat maps from 2013–2022, spanning 57 Landsat scenes (see Hatten, 2016 for details) output by GEE.
Objective 2 – Update and maintain the SWFL Habitat Viewer so users can leverage and display the satellite model’s range-wide database and produce a habitat map for any stream reach in the flycatcher’s range. The web-based (AGOL) application will allow one to query, display, and download flycatcher habitat maps from 2013 to present by leveraging a library of existing habitat maps generated with GEE; create a habitat time series for a given reach; produce a change detection map between two time periods; and produce metadata records based upon the scene’s date and digital footprint. The SWFL Habitat Viewer can also quantify or simulate beetle impacts to flycatcher habitat on a reach-by-reach basis, but simulations are dependent upon the availability of tamarisk maps.
Objective 3 – Participate in regional workgroups, symposia, and conferences to inform potential and existing users about the SWFL Habitat Viewer. Currently, the RiversEdge West biannual conference and NAU’s biannual Colorado Plateau Research conference are the major outlets for presentations, but other regional conference candidates may be in Colorado, Nevada, New Mexico, or California.
Objective 4 – Collaborate in efforts to improve and extend the utility of the flycatcher satellite model by exploring cutting-edge modeling techniques (e.g., occupancy modeling, climate-wildlife modeling). For example, the flycatcher satellite model is being used to develop a regional database that contains patch attributes of SWFL habitat across the entire range of flycatchers. Such information is invaluable for exploring the relationships between patch occupancy and neighborhood characteristics (e.g., number of patches within a given radius, age of patches, distance between patches). The SWFL model is also being integrated into a regionwide project that focuses on linking interdisciplinary scientific data and models with artificial intelligence techniques, with a focus on hydrologic and ecological model integration in the Colorado River Basin, to better address drought and climate change.InitiationBagged trees (aka random forest) classificationDepartment of Interior
DOI-0035-2023DOIUSGSKaguya TC DTM GenerationThe primary goals for FY21 are to develop a processing pipeline for generating Kaguya TC DTMs, generate a test suite of 100 Kaguya TC DTMs using Ames Stereo Pipeline (ASP), and evaluate the resulting products.Initiationextreme gradient boosted classification, stochastic gradient descent (LinearLearner®), multi-layer perceptronDepartment of Interior
DOI-0036-2023DOIUSGSAI/ML for aquatic scienceThis project aims to develop novel computational frameworks and AI algorithms for individual fish recognition by leveraging AI, computer vision, and deep learning. The main objectives of this project include:
(1) Develop baseline AI models by exploiting visual features and pre-trained deep learning models.
(2) Improve individual fish recognition performance, as well as handling new individuals and exploring dynamic environments.
(3) Evaluate melanistic markings associated with “blotchy bass syndrome” to assess the capacity for AI detection of diseased fish.
(4) Evaluate deep learning models for individual recognition and respiration rate (ventilation rate) using video data collected in laboratory settings and natural streams.Initiationconvolutional neural networksDepartment of Interior
DOI-0037-2023DOIUSGSTMDL and Data Mining InvestigationsApply data-mining techniques, including artificial neural network models, to hydrologic investigations.Operation and MaintenanceDeep convolutional neural networks; ResNet, MobileNet, UNet, RetinaNet https://github.com/dbuscombe-usgs/MLMONDAYSDepartment of Interior
DOJ-0000-2023DOJDrug Enforcement AdministrationDrug Signature Program AlgorithmsDEA's Special Testing and Research Laboratory utilizes AI/ML techniques and has developed a robust statistical methodology, including multi-variate statistical analysis tools, to automatically classify the geographical region of origin of samples selected for DEA's Heroin and Cocaine signature programs. The system provides for detection of anomalies and low-confidence results.In production: more than 1 yearDepartment of Justice
DOJ-0001-2023DOJFederal Bureau of InvestigationComplaint Lead Value ProbabilityThe Threat Intake Processing System (TIPS) database uses artificial intelligence (AI) algorithms to accurately identify, prioritize, and process actionable tips in a timely manner. The AI used in this case helps to triage immediate threats in order to help FBI field offices and law enforcement respond to the most serious threats first. Based on the algorithm score, the highest-priority tips are first in the queue for human review.In production: more than 1 yearDepartment of Justice
DOJ-0002-2023DOJJustice Management DivisionIntelligent Records Consolidation ToolThe Office of Records Management Policy uses an AI and Natural Language Processing (NLP) tool to assess the similarity of records schedules across the Department. The tool provides clusters of similar items to significantly reduce the time that the Records Manager spends manually reviewing schedules for possible consolidation. An AI-powered dashboard provides recommendations for schedule consolidation and review, while also providing the Records Manager with the ability to review by cluster or by individual record. The solution's technical approach has applicability to other domains that require text similarity analysis.In production: more than 1 yearDepartment of Justice
DOJ-0003-2023DOJTax DivisionPrivileged Material IdentificationThe application scans documents and looks for attorney/client privileged information. It does this based on keyword input by the system operator.In production: less than 6 monthsDepartment of Justice
DOL-0000-2023DOLForm Recognizer for Benefits FormsCustom machine learning model to extract data from complex forms to tag data entries to field headers. The input is a document or scanned image of the form and the output is a JSON response with key/value pairs extracted by running the form against the custom trained model.Operation and MaintenanceClassification machine learning model involving computer visionDepartment of Labor
DOL-0001-2023DOLLanguage TranslationLanguage translation of published documents and website using natural language processing models.ImplementationCloud based commercial-off-the-shelf pre-trained NLP modelsDepartment of Labor
DOL-0002-2023DOLAudio TranscriptionTranscription of speech to text for records keeping using natural language processing models.Operation and MaintenanceCloud based commercial-off-the-shelf pre-trained NLP modelsDepartment of Labor
DOL-0003-2023DOLText to Speech ConversionText to speech (Neural) for more realistic human sounding applications using natural language processing models.Operation and MaintenanceCloud based commercial-off-the-shelf pre-trained NLP modelsDepartment of Labor
DOL-0004-2023DOLClaims Document ProcessingTo identify if physician’s note contains causal language by training custom natural language processing models.ImplementationNatural language processing for (a) document classification and (b) sentence-level causal passage detectionDepartment of Labor
DOL-0005-2023DOLWebsite Chatbot AssistantThe chatbot helps the end user with basic information about the program, information on who to contact, or seeking petition case status.ImplementationCloud based commercial-off-the-shelf pre-trained chatbotDepartment of Labor
DOL-0006-2023DOLData Ingestion of Payroll FormsCustom machine learning model to extract data from complex forms to tag data entries to field headers. The input is a document or scanned image of the form and the output is a JSON response with key/value pairs extracted by running the form against the custom trained model.InitiationClassification machine learning model involving computer visionDepartment of Labor
DOL-0007-2023DOLHololensAI used by Inspectors to visually inspect high and unsafe areas from a safe location.Operation and MaintenanceDepartment of Labor
DOL-0008-2023DOLDOL Intranet Website Chatbot AssistantConversational chatbot on DOL intranet websites to help answer common procurement questions, as well as specific contract questions.InitiationCloud based commercial-off-the-shelf pre-trained NLP modelsDepartment of Labor
DOL-0009-2023DOLOfficial Document ValidationAI detection of mismatched addresses and garbled text in official letters sent to benefits recipients.ImplementationComputer VisionDepartment of Labor
DOL-0010-2023DOLElectronic Records ManagementMeeting NARA metadata standards for (permanent) federal documents by using AI to identify data within the document, and also using NLP to classify and summarize documents.InitiationCustom text classification machine learning modelDepartment of Labor
DOL-0011-2023DOLCall Recording AnalysisAutomatic analysis of recorded calls made to Benefits Advisors in the DOL Interactive Voice Response (IVR) center.InitiationCloud based commercial-off-the-shelf pre-trained NLP modelsDepartment of Labor
DOL-0012-2023DOLAutomatic Document ProcessingAutomatic processing of continuation of benefits form to extract pre-defined selection boxes.ImplementationCloud based commercial-off-the-shelf pre-trained NLP modelsDepartment of Labor
DOL-0013-2023DOLAutomatic Data Processing Workflow with Form RecognizerAutomatic processing of the current complex workflow to extract required data.InitiationClassification machine learning model involving computer visionDepartment of Labor
DOL-0014-2023DOLCase Recording summarizationUsing an open source large language model to summarize publicly available case recording documents which are void of personal identifiable information (PII) or any other sensitive information. This is not hosted in the DOL technical environment and is reviewed by human note takers.Development and AcquisitionLarge language summarization modelDepartment of Labor
DOL-0015-2023DOLOEWS Occupation AutocoderThe input is state submitted response files that include occupation title and sometimes job description of the surveyed units. The autocoder reads the job title and assigns up to two 6-digit Standard Occupational Classification (SOC) codes along with their probabilities as recommendations for human coders. Codes above a certain threshold are appended to the submitted response file and sent back to states to assist them with their SOC code assignment.Operation and MaintenanceNatural Language Processing, Logistic Regression, ClassificationDepartment of Labor
DOL-0016-2023DOLScanner Data Product ClassificationBLS receives bulk data from some corporations related to the cost of goods they sell and services they provide. Consumer Price Index (CPI) staff have hand-coded a segment of the items in these data into Entry Level Item (ELI) codes. To accept and make use of these bulk data transfers at scale, BLS has begun to use machine learning to label data with ELI codes. The machine learning model takes as input word frequency counts from item descriptions. Logistic regression is then used to estimate the probability of each item being classified in each ELI category based on the word frequency categorizations. The highest probability category is selected for inclusion in the data. Any selected classifications that do not meet a certain probability threshold are flagged for human review.Operation and MaintenanceNatural Language Processing, Logistic Regression, ClassificationDepartment of Labor
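The pipeline this entry describes (word-frequency features fed to logistic regression, with a probability threshold that routes low-confidence items to human review) can be sketched in miniature. This is an illustrative from-scratch sketch, not the BLS system: the vocabulary, item descriptions, and "ELI-A"/"ELI-B" labels are invented, and a binary model stands in for the full set of ELI categories.

```python
import math
from collections import Counter

def featurize(text, vocab):
    # Word-frequency counts over a fixed vocabulary.
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def train_logreg(X, y, lr=0.5, epochs=200):
    # Plain stochastic gradient descent for binary logistic regression.
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-(sum(wj * xj for wj, xj in zip(w, xi)) + b)))
            g = p - yi  # gradient of the log-loss w.r.t. the logit
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, x, review_threshold=0.8):
    p = 1.0 / (1.0 + math.exp(-(sum(wj * xj for wj, xj in zip(w, x)) + b)))
    label = "ELI-A" if p >= 0.5 else "ELI-B"  # hypothetical item codes
    conf = max(p, 1.0 - p)
    # Classifications below the probability threshold go to human review.
    return label, conf, conf < review_threshold

vocab = ["milk", "whole", "gallon", "shampoo", "bottle", "hair"]
docs = ["whole milk gallon", "milk gallon", "hair shampoo bottle", "shampoo bottle"]
y = [1, 1, 0, 0]  # 1 = "ELI-A" (invented dairy code), 0 = "ELI-B"
w, b = train_logreg([featurize(d, vocab) for d in docs], y)
label, conf, needs_review = predict(w, b, featurize("gallon whole milk", vocab))
```

In the production setting described above, one such model per candidate category (or a multinomial model) would produce the per-category probabilities, and only the highest-probability code would be kept.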
DOL-0017-2023DOLExpenditure Classification AutocoderCustom machine learning model to assign a reported expense description from Consumer Expenditure Diary Survey respondents to expense classification categories known as item codes.Development and AcquisitionNatural Language Processing, Random Forest, ClassificationDepartment of Labor
DOS-0000-2023DOSAFederal Procurement Data System (FPDS) Auto-Populate BotA/LM collaborated with A/OPE to develop a bot to automate the data entry in the Federal Procurement Data System (FPDS), reducing the burden on post’s procurement staff and driving improved compliance on DATA Act reporting. This bot is now used to update ~300 FPDS awards per week.  A/LM also partnered with WHA to develop a bot to automate closeout reminders for federal assistance grants nearing the end of the period of performance and begin developing bots to automate receiving report validation and customer service inbox monitoring.Department of State
DOS-0001-2023DOSAProduct Service Code Automation ML ModelA/LM developed a machine learning model to scan unstructured, user entered procurement data such as Requisition Title and Line Descriptions to automatically detect the commodity and services types being purchased for enhanced procurement categorization.Department of State
DOS-0002-2023DOSATailored Integration Logistics Management System (ILMS) User AnalyticsA/LM plans to use available ILMS transactional data and planned transactions to develop tailored user experiences and analytics to meet the specific needs of the user at that moment. By mining real system actions and clicks we can extract more meaningful information about our users to simplify their interactions with the system and reduce time to complete their daily actions.Department of State
DOS-0003-2023DOSASupply Chain Fraud and Risk ModelsA/LM plans to expand current risk analytics through development of AI/ML models for detecting anomalous activity within the Integrated Logistics Management System (ILMS) that could be potential fraud or malfeasance. The models will expand upon existing risk models and focus on key supply chain functions such as: Asset Management, Procure-to-Pay, and Fleet Management.Department of State
DOS-0004-2023DOSATailored Integration Logistics Management System (ILMS) Automated User Support BotILMS developed and deployed an automated support desk assistant using ServiceNow Virtual Agent to simplify support desk interactions for ILMS customers and to deflect easily resolved issues from higher cost support desk agents.Department of State
DOS-0005-2023DOSAWithin Grade Increase AutomationA Natural Language Processing (NLP) model is used in coordination with Intelligent Character Recognition (ICR) to identify and extract values from the JF-62 form for within grade increase payroll actions. Robotic Process Automation (RPA) is then used to validate the data against existing reports, then create a formatted file for approval and processing.Department of State
DOS-0006-2023DOSAVerified Imagery Pilot ProjectThe Bureau of Conflict and Stabilization Operations ran a pilot project to test how the use of a technology service, Sealr, could verify the delivery of foreign assistance to conflict-affected areas where neither the U.S. Department of State nor our implementing partner could go. Sealr uses blockchain encryption to secure photographs taken on smartphones from digital tampering. It also uses artificial intelligence to detect spoofs, like taking a picture of a picture of something. Sealr also has some image recognition capabilities. The pilot demonstrated that technology like Sealr can be used to strengthen remote monitoring of foreign assistance to dangerous or otherwise inaccessible areas.Department of State
DOS-0007-2023DOSAConflict ForecastingCSO/AA is developing a suite of conflict and instability forecasting models that use open-source political, social, and economic datasets to predict conflict outcomes including interstate war, mass mobilization, and mass killings. The use of AI is confined to statistical models, including machine learning techniques such as tree-based methods, neural networks, and clustering approaches.Department of State
DOS-0008-2023DOSCGFSAutomatic Detection of Authentic MaterialThe Foreign Service Institute School of Language Studies is developing a tool for automated discovery of authentic native language texts classified for both topic and Interagency Language Roundtable (ILR) proficiency level to support foreign language curriculum and language testing kit development.Department of State
DOS-0009-2023DOSCSOAutomated Burning DetectionThe Village Monitoring System program uses AI and machine learning to conduct daily scans of moderate resolution commercial satellite imagery to identify anomalies using the near-infrared band.Department of State
DOS-0010-2023DOSCSOAutomated Damage AssessmentsThe Conflict Observatory program uses AI and machine learning on moderate and high-resolution commercial satellite imagery to document a variety of war crimes and other abuses in Ukraine, including automated damage assessments of a variety of buildings, such as critical infrastructure, hospitals, schools, and crop storage facilities.Department of State
DOS-0011-2023DOSCSOServiceNow AI-Powered Virtual Agent (Chatbot)IRM’s BMP Systems is planning to incorporate ServiceNow’s Virtual Agent into our existing applications to connect users with support and data requests. The Artificial Intelligence (AI) is provided by ServiceNow as part of their Platform as a Service (PaaS).Department of State
DOS-0012-2023DOSCSOApptioWorking Capital Fund (IRM/WCF) uses Apptio to bill bureaus for consolidated services run from the WCF. Cost models are built in Apptio so bureaus can budget for the service costs in future FYs. Apptio has the capability to extrapolate future values using several available formulas.Department of State
DOS-0013-2023DOSFNLP for Foreign Assistance Appropriations AnalysisNatural language processing application for F/RA to streamline the extraction of earmarks and directives from the annual appropriations bill. Before NLP this was an entirely manual process.Department of State
DOS-0014-2023DOSFSIeRecords M/L Metadata EnrichmentThe Department’s central eRecords archive leverages machine learning models to add additional metadata to assist with record discovery and review. This includes models for entity extraction, sentiment analysis, classification and identifying document types.Department of State
DOS-0015-2023DOSGPAFacebook Ad Test Optimization SystemGPA’s production media collection and analysis system that pulls data from half a dozen different open and commercial media clips services to give an up-to-date global picture of media coverage around the world.Department of State
DOS-0016-2023DOSGPAGlobal Audience Segmentation FrameworkA prototype system that collects and analyzes the daily media clips reports from about 70 different Embassy Public Affairs Sections.Department of State
DOS-0017-2023DOSGPAMachine-Learning Assisted Measurement and Evaluation of Public OutreachGPA’s production system for collecting, analyzing, and summarizing the global digital content footprint of the Department.Department of State
DOS-0018-2023DOSGPAGPATools and GPAIXGPA’s production system for testing potential messages at scale across segmented foreign sub-audiences to determine effective outreach to target audiences.Department of State
DOS-0019-2023DOSIRMAI Capabilities Embedded in SMARTModels have been embedded in the backend of the SMART system on OpenNet to perform entity extraction of objects within cables, sentiment analysis of cables, keyword extraction of topics identified within cables, and historical data analysis to recommend addressees and passlines to users when composing cables.Department of State
DOS-0020-2023DOSPMNLP to pull key information from unstructured textUse NLP to extract information such as country names and agreement dates from dozens of pages of unstructured PDF documentsDepartment of State
DOS-0021-2023DOSPMK-Means clustering into tiersCluster countries into tiers based on open-source and bureau data using k-means clusteringDepartment of State
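The tiering approach named in this entry can be sketched with a from-scratch implementation of Lloyd's algorithm, the standard k-means procedure. The country names and two-dimensional indicator scores below are invented stand-ins for the open-source and bureau data the entry mentions.

```python
def kmeans(points, k, iters=25):
    # Lloyd's algorithm, initialized from the first k points for determinism.
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            assign[i] = min(range(k),
                            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [points[i] for i in range(len(points)) if assign[i] == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assign

# Invented per-country indicator scores (two features per country).
scores = {"A": (0.9, 0.8), "B": (0.85, 0.9), "C": (0.2, 0.1),
          "D": (0.15, 0.2), "E": (0.5, 0.55)}
names = list(scores)
tiers = dict(zip(names, kmeans([scores[n] for n in names], k=3)))
```

Countries with similar scores end up sharing a tier label; in practice one would standardize the features and try several initializations rather than the fixed seeding used here.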
DOS-0022-2023DOSROptical Character Recognition – text extractionExtract text from images using standard python libraries; inputs have been websites to collect dataDepartment of State
DOS-0023-2023DOSRTopic ModelingCluster text into themes based on frequency of used words in documents; has been applied to digital media articles as well as social media posts; performed using available Python librariesDepartment of State
DOS-0024-2023DOSRForecastingUsing statistical models to project expected outcomes into the future; this has been applied to COVID cases as well as violent events in relation to tweetsDepartment of State
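The forecasting entry above can be illustrated with a minimal trend projection; a simple linear fit stands in for the richer statistical models (e.g., ARIMA) an analyst might actually use, and the weekly counts are invented.

```python
import numpy as np

# Hypothetical weekly counts of events (e.g., cases or violent events).
weeks = np.arange(12)
counts = np.array([5, 7, 9, 12, 15, 17, 21, 24, 28, 31, 35, 38], dtype=float)

# Fit a linear trend to the historical series.
slope, intercept = np.polyfit(weeks, counts, deg=1)

# Project the expected outcome four weeks into the future.
future_weeks = np.arange(12, 16)
forecast = slope * future_weeks + intercept
print(forecast.round(1))
```

A production forecast would add seasonality, uncertainty intervals, and backtesting, but the project-forward step is the same idea.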
DOS-0025-2023DOSRDeepfake DetectorDeep learning model that takes in an image containing a person’s face and classifies the image as either being real (contains a real person’s face) or fake (synthetically generated face, a deepfake often created using Generative Adversarial Networks).Department of State
DOS-0026-2023DOSRSentiBERTIQGEC A&R uses deep contextual AI of text to identify and extract subjective information within the source material. This sentiment model was trained by fine-tuning a multilingual, BERT model leveraging word embeddings across 2.2 million labeled tweets spanning English, Spanish, Arabic, and traditional and simplified Chinese. The tool will assign a sentiment to each text document and output a CSV containing the sentiment and confidence interval for user review.Department of State
DOS-0027-2023DOSRTOPIQGEC A&R’s TOPIQ tool automatically classifies text into topics for analyst review and interpretation. The tool uses Latent Dirichlet Allocation (LDA), a natural language processing technique that uncovers a specified number of topics from a collection of documents, and then assigns the probability that each document belongs to a topic.Department of State
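The LDA workflow TOPIQ describes (uncover a specified number of topics from a document collection, then assign each document a topic probability) can be sketched with scikit-learn; the four toy documents below are illustrative only, not GEC A&R data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "election vote ballot campaign",
    "vote election candidate ballot",
    "vaccine health hospital doctor",
    "hospital doctor vaccine clinic",
]

# LDA operates on raw word counts rather than TF-IDF weights.
counts = CountVectorizer().fit_transform(docs)

# Uncover a specified number of topics (here 2) from the collection.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)  # each row sums to 1: P(topic | doc)

for i, dist in enumerate(doc_topics):
    print(f"doc {i}: topic {dist.argmax()} (p={dist.max():.2f})")
```

Analysts would then review the top words per topic to interpret and label the themes.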
DOS-0028-2023DOSRText SimilarityGEC A&R’s Text Similarity capability identifies texts that are identical or nearly identical by calculating the cosine similarity between each pair of texts. Texts that share high cosine similarity are then grouped and made available for analysts to review further.Department of State
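A minimal sketch of the cosine-similarity grouping described above, using TF-IDF vectors; the example texts and the 0.9 threshold are illustrative assumptions, not GEC A&R's actual pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

texts = [
    "The new policy takes effect on Monday.",
    "The new policy takes effect on Monday!",   # near-duplicate
    "Weather will be sunny with light winds.",
]

# Vectorize each text, then compute the full pairwise similarity matrix.
vectors = TfidfVectorizer().fit_transform(texts)
sim = cosine_similarity(vectors)

# Flag pairs whose cosine similarity exceeds a (hypothetical) threshold.
THRESHOLD = 0.9
pairs = [(i, j) for i in range(len(texts)) for j in range(i + 1, len(texts))
         if sim[i, j] >= THRESHOLD]
print(pairs)  # → [(0, 1)]: the near-identical pair, flagged for review
```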
DOS-0029-2023DOSRImage ClusteringUses a pretrained deep learning model to generate image embeddings, then uses hierarchical clustering to identify similar images.Department of State
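The image-clustering recipe above (pretrained-model embeddings, then hierarchical clustering) can be sketched with SciPy; the 4-dimensional vectors below are stand-ins for real CNN embeddings, which would typically have hundreds or thousands of dimensions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Stand-in embeddings, as if produced by a pretrained deep learning model.
embeddings = np.array([
    [1.00, 0.90, 0.10, 0.00],
    [0.95, 0.92, 0.05, 0.02],   # visually similar to the first image
    [0.00, 0.10, 1.00, 0.90],
    [0.02, 0.05, 0.95, 1.00],   # visually similar to the third image
])

# Agglomerative (hierarchical) clustering on cosine distance.
tree = linkage(embeddings, method="average", metric="cosine")

# Cut the tree at a distance threshold to form clusters of similar images.
labels = fcluster(tree, t=0.5, criterion="distance")
print(labels)  # images sharing a label are near-duplicates
```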
DOS-0030-2023DOSRLouvain Community DetectionTakes in a social network and clusters nodes together into “communities” (i.e., similar nodes are grouped together)Department of State
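Louvain community detection as described above can be demonstrated with NetworkX (`louvain_communities` is available in recent NetworkX releases); the toy social network below is illustrative.

```python
import networkx as nx

# Toy social network: two dense friend groups joined by one bridge edge.
G = nx.Graph()
G.add_edges_from([
    ("a", "b"), ("b", "c"), ("a", "c"),   # first community
    ("x", "y"), ("y", "z"), ("x", "z"),   # second community
    ("c", "x"),                           # bridge between the groups
])

# Louvain maximizes modularity, grouping densely linked nodes together.
communities = nx.community.louvain_communities(G, seed=0)
print(sorted(sorted(c) for c in communities))
```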
DOT-0000-2023DOTANGRemote Oceanic Meteorological Information Operations (ROMIO)ROMIO is an operational demonstration to evaluate the feasibility of uplinking convective weather information to aircraft operating over the ocean and remote regions. The capability converts weather satellite data, lightning data, and weather prediction model data into areas of thunderstorm activity and cloud-top heights. AI is used to improve the accuracy of the output based on previous activity compared to ground-truth data.Technical transfer of the capability to industry is planned this summer.AI, ML via a Convolutional Neural NetworkDepartment of Transportation
DOT-0001-2023DOTANGDetermining Surface Winds with Machine Learning SoftwareSuccessfully demonstrated use of an AI capability to analyze camera images of a wind sock to produce highly accurate surface wind speed and direction information in remote areas that don’t have a weather observing sensor.Successfully tested but not in production.AIDepartment of Transportation
DOT-0002-2023DOTATOSurface Report Classifier (SCM/Auto-Class)SCM classifies surface incident reports by event type, such as Runway Incursion, Runway Excursion, and Taxiway Incursion/Excursion, and further categorizes runway incursions by severity (Category A, B, C, D, E)Refinements planned for future releaseSupport Vector Machines, Gradient boosting, neural networks, natural language processingDepartment of Transportation
DOT-0003-2023DOTATOAutomated Delay detection using voice processingIn order to get a full accounting of delay, automated voice detection of ATC and aircraft interaction is required. Many delay events, such as vectoring, are not currently reported/detected/accounted for and voice detection would enable automated detection.Initial developmentNatural Language Processing;Department of Transportation
DOT-0004-2023DOTAVSRegulatory Compliance Mapping ToolThe AVS International office is required to identify means of compliance with ICAO Standards and Recommended Practices (SARPs). Both SARPs and means-of-compliance evidence are text paragraphs scattered across thousands of pages of documents. AOV identified a need to find each SARP, evaluate the text of many FAA Orders, and suggest evidence of compliance based upon that evaluation. The base dataset used by RCMT is the documents’ texts deconstructed into paragraphs. RCMT runs all of the documents’ paragraphs through Natural Language Processing (NLP) (an AI-based step) to extract the meaning (semantics) of the text. RCMT then employs a recommender system (also using AI technology) that takes the texts, augmented by their extracted meaning, to establish candidate matches between the ICAO SARPs and the FAA text that provides means of compliance.User Acceptance Testing to begin early spring '22ML (Recommender Algorithm), NLPDepartment of Transportation
DOT-0005-2023DOTAVSJASC Code classification in Safety Difficulty Reports (SDR)AVS identified a need to derive the joint aircraft system codes (JASC) chapter codes from the narrative description within service difficulty reports (SDR), a form of safety event reporting from aircraft operators. A team of graduate students at George Mason University collaborated with AVS employees to apply Natural Language Processing (NLP) and Machine Learning to predict JASC codes. This method can be used to check SDR entries to ensure the correct codes were provided or to assign a code when one was not.NLP, ML ClassificationDepartment of Transportation
DOT-0006-2023DOTAviation Safety (AVS)Course Deviation Identification for Multiple Airport Route Separation (MARS)The Multiple Airport Route Separation (MARS) program is developing a safety case for reduced separation standards between Performance Based Navigation (PBN) routes in terminal airspace. These new standards may enable deconfliction of airports in high-demand metropolitan areas, including the Northeast Corridor (NEC), North Texas, and Southern California. To build necessary collision risk models for the safety case, several models are needed, including one that describes the behavior of aircraft that fail to navigate the procedure correctly. These events are very rare and difficult to identify with standard data sources. Prior work has used Machine Learning to filter incident data to identify similar events on departure procedures.Python in Jupyter Labs, ML, NLPDepartment of Transportation
DOT-0007-2023DOTNSR Human Injury Research DivisionMachine Learning for Occupant Safety ResearchDescription: Utilize deep learning models to predict head kinematics directly from crash videos. Deep learning enables the extraction of 3D kinematics from 2D views, offering a viable alternative for calculating head kinematics when sensors are absent or high-quality sensor data is unavailable
Input: Vehicle crash videos
Output: Angular velocity - injury prediction"Proof of Concept completed and publishedDeep learning models - Convolutional Neural Networks, Long-Short Term Memory based Recurrent Neural NetworksDepartment of Transportation
DOT-0008-2023DOTNSR Human Injury Research DivisionMachine Learning for Occupant Safety ResearchDescription: Utilize deep learning to predict crash parameters, Delta-V (change in velocity) and PDOF (principal direction of force), directly from real-world crash images. Delta-V and PDOF are the two most important parameters affecting injury outcome. Deep learning models can predict both without the need to run the WinSmash software for Delta-V computation and without requiring estimations by crash examiners. Moreover, deep learning models can produce Delta-V and PDOF within milliseconds, providing rapid results for improved efficiency
Input: Real world crash images
Output: Delta-V & PDOF"Currently under developmentDeep learning models - Convolutional Neural NetworksDepartment of Transportation
DOT-0009-2023DOTNextGen (ANG)Offshore Precipitation Capability (OPC)OPC leverages data from several sources such as weather radar, lightning networks, satellite and numerical models to produce a radar-like depiction of precipitation. The algorithm then applies machine learning techniques based on years of satellite and model data to improve the accuracy of the location and intensity of the precipitation areas.OPC runs in a pseudo-operational capacity via a webpage maintained by the Massachusetts Institute of Technology - Lincoln Lab, as well as in a test and evaluation capacity in a research mode.AI, ML via a Convolutional Neural NetworkDepartment of Transportation
DOT-0010-2023DOTOffice of Research, Development and TechnologyDevelopment of Predictive Analytics Using Autonomous Track Geometry Measurement System (ATGMS) DataDescription: Leveraging large volumes of these recursive track geometry measurements to develop and implement automated machine-learning-based processes for analyzing, predicting, and reporting track locations of concern, including those with significant rates of degradation.
Input: Track geometry measurements and exceptions
Output: Inspection report that includes the trending of track geometry measures and time to failure (i.e., maintenance and safety limits)."Department of Transportation
DOT-0011-2023DOTOffice of Research, Development and TechnologyCrushed Aggregate Gradation Evaluation SystemDescription: Deep learning computer vision algorithms aimed at analyzing aggregate particle size grading.
Input: Images of ballast cross sections
Output: Ballast fouling index"Department of Transportation
DOT-0012-2023DOTOffice of Research, Development and TechnologyAutomatic Track Change Detection Demonstration and AnalysisDescription: DeepCNet-based neural network to identify and classify track-related features (e.g., track components, such as fasteners and ties) for "change detection" applications.
Input: Line-scan images from rail-bound inspection systems
Output: Notification of changes from status quo or between different inspections based on geolocation."Department of Transportation
DOT-0013-2023DOTPHMSA Office of Chief Counsel (PHC)PHMSA Rule MakingArtificial Intelligence Support for Rulemaking - Using ChatGPT to support the rulemaking process to provide significant efficiencies, reduction of effort, or the ability to scale efforts for unusual levels of public scrutiny or interest (e.g., comments on a rulemaking). ChatGPT will be used to provide:
1. Sentiment Analysis – Is the comment positive, negative, or neutral toward the proposed rule.
2. Relevance Analysis – Whether the particular comment posted is relevant to the proposed rule.
3. Synopsis of the posted comment.
4. Cataloging of comments.
5. Identification of duplicate comments."This is a pilot initiativeChatGPT, NLPDepartment of Transportation
ED-0000-2023EDFederal Student AidAidan Chat-botFSA's virtual assistant uses natural language processing to answer common financial aid questions and help customers get information about their federal aid on StudentAid.gov.
In just over two years, Aidan has interacted with over 2.6 million unique customers, resulting in more than 11 million user messages."Department of Education
EPA-0000-2023EPAUse of random forest model to predict exposure pathwaysPrioritizing the potential risk posed to human health by chemicals requires tools that can estimate exposure from limited information. In this study, chemical structure and physicochemical properties were used to predict the probability that a chemical might be associated with any of four exposure pathways leading from sources (consumer/near-field, dietary, far-field industrial, and far-field pesticide) to the general population. The balanced accuracies of these source-based exposure pathway models range from 73 to 81%, with the error rate for identifying positive chemicals ranging from 17 to 36%. We then used exposure pathways to organize predictions from 13 different exposure models as well as other predictors of human intake rates. We created a consensus meta-model using the Systematic Empirical Evaluation of Models framework, in which the predictors of exposure were combined by pathway and weighted according to predictive ability for chemical intake rates inferred from human biomonitoring data for 114 chemicals. The consensus model yields an R2 of ∼0.8. We extrapolate to predict relevant pathway(s), median intake rate, and credible interval for 479,926 chemicals, mostly with minimal exposure information. This approach identifies 1,880 chemicals for which the median population intake rates may exceed 0.1 mg/kg bodyweight/day, while there is 95% confidence that the median intake rate is below 1 μg/kg BW/day for 474,572 compounds.
Consensus Modeling of Median Chemical Intake for the U.S. Population Based on Predictions of Exposure Pathways"U.S. Environmental Protection Agency
EPA-0001-2023EPARecords CategorizationThe records management technology team is using machine learning to predict the retention schedule for records. The machine learning model will be incorporated into a records management application to help users apply retention schedules when they submit new records.U.S. Environmental Protection Agency
EPA-0002-2023EPAEnforcement TargetingEPA’s Office of Compliance, in partnership with the University of Chicago, built a proof-of-concept to improve enforcement of environmental regulations through facility inspections by the EPA and state partners. The resulting predictive analytics showed a 47% improvement of identifying violations of the Resource Conservation and Recovery Act.U.S. Environmental Protection Agency
GSA-0000-2023GSAFASAcquisition AnalyticsTakes Detailed Data on transactions and classifies each transaction within the Government-wide Category Management TaxonomyOperation and MaintenanceSupervised Machine Learning - ClassificationU.S. General Services Administration
GSA-0001-2023GSAFASCity Pairs Program Ticket Forecast and Scenario Analysis ToolsTakes segment-level City Pair Program air travel purchase data and creates near-term forecasts for the current and upcoming fiscal year by month and at various levels of granularity including DOD vs Civilian, Agency, and Region.Development and AcquisitionTime Series Forecasthttps://github.helix.gsa.gov/EDA/City_Pair_Program_Forecast.gitU.S. General Services Administration
GSA-0002-2023GSAFASCategory Taxonomy Refinement Using NLPUses token extraction from product descriptions to more accurately shape intended markets for Product Service Codes (PSCs).Operation and MaintenanceNLPU.S. General Services Administration
GSA-0003-2023GSAFASKey KPI Forecasts for GWCMTakes monthly historical data for underlying components used to calculate KPIs and creates near-term forecasts for the upcoming fiscal year. Pilot effort focuses on total agency/category spend (the denominator in multiple KPIs). If the pilot program is successful, the same methodology can be extended to other KPIs.ImplementationTime Series ForecastU.S. General Services Administration
GSA-0004-2023GSAFAS (QP0A)Contract Acquisition Lifecycle Intelligence (CALI)The CALI tool is an automated machine-learning evaluation tool built to streamline the evaluation of vendor proposals against solicitation requirements in support of the Source Selection process. Once the Contracting Officer (CO) has received vendor proposals for a solicitation and is ready to perform the evaluation process, the CO initiates evaluation by sending the solicitation documents, along with all associated vendor proposal documents, to the Source Selection module, which passes all documents to CALI. CALI processes the documents and associated metadata and analyzes the proposals in four key areas: format compliance, forms validation, reps & certs compliance, and requirements compliance. The designated evaluation members can review the evaluation results in CALI and submit finalized evaluation results back to the Source Selection module. CALI is currently being trained with sample data from the EULAs under the Multiple Award Schedule (MAS) program.ImplementationNatural Language ProcessingU.S. General Services Administration
GSA-0005-2023GSAFAS / GSA IT (IC)Chatbot for Federal Acquisition CommunityThe introduction of a chatbot will enable the GSA FAS NCSC (National Customer Support Center) to streamline the customer experience process and automate answers to documented commonly asked questions through public-facing knowledge articles. The end goal is to reduce staffing requirements for NCSC’s live chat programs and allow NCSC resources to be dedicated to other proactive customer service initiatives. Customers will still have the option to connect to a live agent by requesting one.Operation and MaintenanceVirtual assistant; Natural Language Processing (NLP)U.S. General Services Administration
GSA-0006-2023GSAGSA IT (IC)Document Workflow / Intelligent Data Capture and ExtractionGSA is driving towards a more accurate and scalable document workflow platform. GSA seeks to intelligently capture, classify, and transfer critical data from unstructured and structured documents, namely PDF files, to the right process, workflow, or decision engine.Operation and MaintenanceIntelligent Document Recognition (IDR); Optical Character Recognition (OCR); Intelligent Character Recognition (ICR); Optical Mark Reading (OMR); Barcode Recognition; Robotic Process Automation (RPA); API Automation; Machine Learning;U.S. General Services Administration
GSA-0007-2023GSAGSA IT (IDT)Service Desk Generic Ticket ClassificationWe are building a model to take generic Service Desk tickets and classify them so that they can be automatically re-routed to the correct team that handles these types of tickets. The process of re-routing generic tickets is currently done manually, so the model will allow us to automate it. The initial model will target the top 5 most common ticket types.ImplementationNatural Language ProcessingU.S. General Services Administration
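A ticket-routing classifier like the one described in the entry above can be sketched as a TF-IDF plus logistic regression pipeline; the tickets, team labels, and model choice below are hypothetical illustrations, not GSA's actual implementation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled tickets; a real model would train on the historical
# queue of manually routed Service Desk tickets.
tickets = [
    "cannot reset my password", "password expired need reset",
    "laptop will not power on", "computer screen stays black",
    "need access to shared drive", "request permission for network folder",
]
teams = ["accounts", "accounts", "hardware", "hardware", "access", "access"]

# TF-IDF features feed a multiclass logistic regression classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(tickets, teams)

# Route a new generic ticket to the predicted team.
print(model.predict(["forgot my password again"]))  # → ['accounts']
```

In production, only the most common ticket types would be automated first, with low-confidence predictions falling back to manual routing.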
GSA-0008-2023GSAGSA IT (IDT)Service Desk Virtual Agent (Curie)Virtual agent that uses ML to provide predictive results for chat entries. A natural language chatbot (virtual assistant) named Curie serves as part of a multi-model customer service experience for employees' IT service requests, leveraging knowledge-base articles.Operation and MaintenanceAssisted ML; Natural Language ProcessingU.S. General Services Administration
GSA-0009-2023GSAOGPSolicitation Review Tool (SRT)The SRT intakes SAM.gov data for all Information and Communications Technology (ICT) solicitations. The system then compiles the data into a database to be used by machine learning algorithms. The first of these is a Natural Language Processing model that determines if a solicitation contains compliance language. If a solicitation does not have compliance language, then it is marked as non-compliant. Each agency is asked to review their data and validate the SRT predictions. GSA also conducts random manual reviews monthly.Operation and MaintenanceNatural Language Processing (NLP); Intelligent Document Recognition (IDR); Optical Character Recognition (OCR); Intelligent Character Recognition (ICR); Robotic Process Automation (RPA); Machine Learninghttps://github.com/GSA/srt-apiU.S. General Services Administration
GSA-0010-2023GSATTSClassifying Qualitative DataUSAGov and USAGov en Español collect large amounts of qualitative data from survey comments, web searches and call center chat transcripts. Comments are grouped together by topic to determine where we need to make product updates/enhancementsOperation and MaintenanceNatural Language Processing (NLP)U.S. General Services Administration
GSA-0011-2023GSATTS/IAEIAE FSD CCAI Virtual AgentThe virtual agent uses manual learning to understand customer needs and provide a response appropriately. Our AI is named SAM and uses natural language.Operation and MaintenanceManual Learning/Natural LanguageU.S. General Services Administration
HHS-0000-2023HHSACFACF Children's BureauInformation Gateway OneReach ApplicationThe Information Gateway hotline connects to a phone IVR managed by OneReach AI. OneReach maintains a database of state hotlines for reporting child abuse and neglect that it can connect a caller to based on their inbound phone area code. Additionally, OneReach offers a limited FAQ texting service that utilizes natural language processing to answer user queries. User queries are used for reinforcement training by a human AI trainer and to develop additional FAQs.Operation and MaintenanceDepartment of Health and Human Services
HHS-0001-2023HHSAHRQAHRQAHRQ SearchOrganization-wide search that includes Relevancy Tailoring, Auto-generated Synonyms, Automated Suggestions, Suggested Related Content, Auto Tagging, and "Did you mean" features to allow visitors to find specific contentOperation and MaintenanceDepartment of Health and Human Services
HHS-0002-2023HHSAHRQAHRQChatbotProvide interface to allow user to conversationally ask questions about AHRQ content to replace public inquiry telephone lineOperation and MaintenanceDepartment of Health and Human Services
HHS-0003-2023HHSASPRBARDA (CBRN & DRIVe)ReDIRECT: ClarivateAI to identify drug repurposing candidatesOperation and MaintenanceDepartment of Health and Human Services
HHS-0004-2023HHSASPRBARDA (CBRN & DRIVe)ReDIRECT: AriScienceAI to identify drug repurposing candidatesDevelopment and AcquisitionDepartment of Health and Human Services
HHS-0005-2023HHSASPRBARDA (CBRN)Burn & Blast MCMs: RivannaAI Based algorithms on Accuro XV to detect and highlight fractures and soft tissue injuriesDevelopment and AcquisitionDepartment of Health and Human Services
HHS-0006-2023HHSASPRBARDA (CBRN)Burn & Blast MCMs: PhilipsAI-based algorithms on Lumify handheld ultrasound system to detect lung injury and infectious diseasesDevelopment and AcquisitionDepartment of Health and Human Services
HHS-0007-2023HHSASPRBARDA (CBRN)Burn & Blast MCMs: PhilipsAI-based algorithms on Lumify handheld ultrasound system to detect traumatic injuriesDevelopment and AcquisitionDepartment of Health and Human Services
HHS-0008-2023HHSASPRBARDA (CBRN)Burn & Blast MCMs: SpectralMDDetermination of burn depth severity and burn size of injuriesDevelopment and AcquisitionDepartment of Health and Human Services
HHS-0009-2023HHSASPRBARDA (DRIVe)Digital MCM: VirufyUsing forced cough vocalization (FCV) in a smartphone to detect the presence of COVID-19 using AI.Operation and MaintenanceDepartment of Health and Human Services
HHS-0010-2023HHSASPRBARDA (DRIVe)Current HealthContinuous monitoring platform and AI algorithm for COVID severityOperation and MaintenanceDepartment of Health and Human Services
HHS-0011-2023HHSASPRBARDA (DRIVe)Digital MCM: RaisonanceUsing forced cough vocalization (FCV) in a smartphone to detect the presence of COVID-19 and Influenza using AI.Development and AcquisitionDepartment of Health and Human Services
HHS-0012-2023HHSASPRBARDA (DRIVe)Digital MCM: Visual DxUsing smartphone image with AI to detect the presence of mPoxDevelopment and AcquisitionDepartment of Health and Human Services
HHS-0013-2023HHSASPRBARDA (DRIVe)Host-Based Diagnostics: PatchdWearable device and AI model to predict sepsis at home.Development and AcquisitionDepartment of Health and Human Services
HHS-0014-2023HHSASPRChief Data OfficerData ModernizationDevelop open data management architecture that enables optimized business intelligence (BI) and machine learning (ML) on all ASPR data.InitiationDepartment of Health and Human Services
HHS-0015-2023HHSASPROffice of Critical InfrastructureCyber Threat Detection/ Predictive analyticsUse AI and ML tools for processing of extremely large threat dataInitiationDepartment of Health and Human Services
HHS-0016-2023HHSASPROffice of Information Management, Data and AnalyticsemPOWERUsing AI capabilities to rapidly develop the emPOWER COVID-19 At-Risk Population data tools and programOperation and MaintenanceDepartment of Health and Human Services
HHS-0017-2023HHSASPROffice of Information Management, Data and Analytics/Division of Supply Chain Control TowerCommunity Access to TestingUtilizing several ML models to forecast a surge in the pandemicOperation and MaintenanceDepartment of Health and Human Services
HHS-0018-2023HHSASPROffice of Information Management, Data, and Analytics/Division of Modeling and SimulationModeling & SimulationCreate modeling tools and perform analyses in advance of biothreat events and be able to refine them during emergent eventsInitiationDepartment of Health and Human Services
HHS-0019-2023HHSASPROffice of Information Management, Data, and Analytics/Division of Supply Chain Control TowerVentilator Medication ModelLeveraging generalized additive model to project ventilated rate of COVID inpatientsOperation and MaintenanceDepartment of Health and Human Services
HHS-0020-2023HHSASPROffice of Information Management, Data, and Analytics/ODAProduct redistribution optimizationUsing AI and models, allow partners (jurisdictions, pharmacies, federal entities) to optimize redistribution of products based on various factors like distance, ordering/admins, equity, etc.Development and AcquisitionDepartment of Health and Human Services
HHS-0021-2023HHSASPROffice of Information Management, Data, and Analytics/ODAHighly Infectious Patient Movement optimizationGiven a limited number of highly infectious patient transport containers, optimize US location based on various factors like distance, population, etc. Use as a planning tool for decision-making.InitiationDepartment of Health and Human Services
HHS-0022-2023HHSCDCCSELSTowerScout: Automated cooling tower detection from aerial imagery for Legionnaires' Disease outbreak investigationTowerScout scans aerial imagery and uses object detection and image classification models to detect cooling towers, which can be sources of community outbreaks of Legionnaires' Disease.Operation and MaintenanceDepartment of Health and Human Services
HHS-0023-2023HHSCDCCSELSHaMLET: Harnessing Machine Learning to Eliminate TuberculosisHaMLET uses computer vision models to detect TB from chest x-rays to improve the quality of overseas health screenings for immigrants and refugees seeking entry to the U.S.Development and AcquisitionDepartment of Health and Human Services
HHS-0024-2023HHSCDCCSELSZero-shot learning to identify menstrual irregularities reported after COVID-19 vaccinationZero-shot learning was used to identify and classify reports of menstrual irregularities after receiving COVID-19 vaccinationOperation and MaintenanceDepartment of Health and Human Services
HHS-0025-2023HHSCDCNCCDPHP/DDTValidation Study of Deep Learning Algorithms to Explore the Potential Use of Artificial Intelligence for Public Health Surveillance of Eye DiseasesApplying deep learning algorithms for detecting diabetic retinopathy to the NHANES retinal photos. The purpose of this project is to determine whether these algorithms could be used in the future to replace ophthalmologist grading and grade retinal photos collected for surveillance purposes through the National Health and Nutrition Examination Survey (NHANES).Development and AcquisitionDepartment of Health and Human Services
HHS-0026-2023HHSCDCNCCDPHP/DNPAOAutomating extraction of sidewalk networks from street-level imagesA team of scientists participating in CDC's Data Science Upskilling Program are building a computer vision model to extract information on the presence of sidewalks from street-level images from Mapillary.Development and AcquisitionDepartment of Health and Human Services
HHS-0027-2023HHSCDCNCCDPHP/DNPAOIdentify walking and bicycling trips in location-based data, including global-positioning system data from smartphone applicationsThe Division of Nutrition, Physical Activity, and Obesity at the National Center for Chronic Disease Prevention and Health Promotion is developing machine learning techniques to identify walking and bicycling trips in GPS-based data sources. Inputs would include commercially-available location-based data similar to those used to track community mobility during the COVID-19 pandemic. Outputs could include geocoded data tables, GIS layers, and maps.InitiationDepartment of Health and Human Services
HHS-0028-2023HHSCDCNCCDPHP/DNPAOIdentify infrastructure supports for physical activity (e.g. sidewalks) in satellite and roadway imagesThe Division of Nutrition, Physical Activity, and Obesity at the National Center for Chronic Disease Prevention and Health Promotion is interested in developing and promoting machine learning techniques to identify sidewalks, bicycle lanes, and other infrastructure in images, both satellite and roadway images. The inputs would include image-based data. The outputs could be geocoded data tables, maps, GIS layers, or summary reports.InitiationDepartment of Health and Human Services
HHS-0029-2023HHSCDCNCCDPHP/DNPAOIdentifying state and local policy provisions that promote or inhibit creating healthy built environmentsThe Division of Nutrition, Physical Activity, and Obesity at the National Center for Chronic Disease Prevention and Health Promotion is interested in developing and promoting natural language processing and machine learning techniques to improve the efficiency of policy surveillance. Inputs are the text of state and local policies, including law (e.g., statute, legislation, regulation, court opinion), procedure, administrative action, etc. and outputs are datasets that capture relevant aspects of the policy as quantifiable information. To date (Apr 2023), DNAPO has not performed this work in-house, but is working with a contractor on various experiments comparing machine learning with traditional methods and identifying CDC, academic and other groups doing related work.InitiationDepartment of Health and Human Services
HHS-0030-2023HHSCDCNCEZIDUse of Natural Language Processing for Topic Modeling to Automate Review of Public Comments to Notice of Proposed RulemakingDevelopment of a Natural Language Processing Topic Modeling tool to improve efficiency for the process of clustering public comments to a 'notice of proposed rulemaking'Development and AcquisitionDepartment of Health and Human Services
HHS-0031-2023HHSCDCNCHSSemi-Automated Nonresponse Detection for Surveys (SANDS)NCHS has developed and released an item nonresponse detection model to identify cases of item nonresponse (e.g., gibberish, uncertain/don't know, refusals, or high-risk responses) among open-text responses, to help improve survey data and question and questionnaire design. The system is a Natural Language Processing (NLP) model pre-trained using Contrastive Learning and fine-tuned on a custom dataset of survey responses.Operation and MaintenanceDepartment of Health and Human Services
HHS-0032-2023HHSCDCNCHSSequential Coverage Algorithm (SCA) and partial Expectation-Maximization (EM) estimation in Record LinkageCDC's National Center for Health Statistics (NCHS) Data Linkage Program has implemented both supervised and unsupervised machine learning (ML) techniques in their linkage algorithms. The Sequential Coverage Algorithm (SCA), a supervised ML algorithm, is used to develop joining methods (or blocking groups) when working with very large datasets. The unsupervised partial Expectation-Maximization (EM) estimation is used to estimate the proportion of pairs that are matches within each block. Both methods improve linkage accuracy and efficiency.Operation and MaintenanceDepartment of Health and Human Services
HHS-0033-2023HHSCDCNCHSCoding cause of death information on death certificates to ICD-10MedCoder assigns ICD-10 cause of death codes to the literal cause-of-death text provided by the certifier on the death certificate. This includes codes for the underlying and contributing causes of death.Operation and MaintenanceDepartment of Health and Human Services
HHS-0034-2023HHSCDCNCHSDetecting Stimulant and Opioid Misuse and Illicit UseAnalyze clinical notes to detect illicit use and misuse of stimulants and opioidsInitiationDepartment of Health and Human Services
HHS-0035-2023HHSCDCNCHSAI/ML Model Release StandardsNCHS is creating a set of model release standards for AI/ML projects that should be adhered to throughout the Center, and could serve as a starting point for broader standards across the AI/ML development lifecycle to be created at NCHS and throughout CDC.Development and AcquisitionDepartment of Health and Human Services
HHS-0036-2023HHSCDCNCHSPII detection using Private AINCHS has been evaluating Private AI's NLP solution designed to identify, redact, and replace PII in text data. This suite of models is intended to be used to safely identify and remove PII from free text data sets across platforms within the CDC network.Development and AcquisitionDepartment of Health and Human Services
HHS-0037-2023HHSCDCNCHSTranscribing Cognitive Interviews with WhisperCurrent transcription processes for cognitive interviews are limited. Manual transcription is time-consuming and the current automated solution is low quality. Recently, open-source AI models have been released that appear to perform substantially better than previous technologies in automated transcription of video/audio. Of note is the model by OpenAI named Whisper (publication, code, model card), which has been made available under a fully permissive license. Although Whisper is currently considered state-of-the-art compared to other AI models in standard benchmarks, it has not been tested with cognitive interviews. We hypothesize Whisper will produce production-quality transcriptions for NCHS. We plan to compare it against both VideoBank and manual transcription. If the results are encouraging, we plan to transcribe all videos from the CCQDER archive.Development and AcquisitionDepartment of Health and Human Services
HHS-0038-2023HHSCDCNCHSNamed Entity Recognition for Opioid Use in Free Text Clinical Notes from Electronic Health RecordsA team of scientists participating in CDC's Data Science Upskilling Program are developing an NLP Named Entity Recognition model to detect the assertion or negation of opioid use in electronic medical records from the National Hospital Care SurveyDevelopment and AcquisitionDepartment of Health and Human Services
HHS-0039-2023HHSCDCNCIPC/DIPNowcasting Suicide TrendsAn internal-facing, interactive dashboard incorporating multiple traditional and non-traditional datasets and a multi-stage machine learning pipeline to 'nowcast' suicide death trends nationally on a week-to-week basis.Operation and MaintenanceDepartment of Health and Human Services
HHS-0040-2023HHSCDCNCIRDNCIRD SmartFind ChatBots - Public and InternalDevelop conversational ChatBots (Public Flu, Public COVID-19 Vaccination, Internal Knowledge-Bot) that analyze free-text questions entered by the public, healthcare providers, partners, and internal staff, and provide agency-cleared answers that best match the question. Developed in collaboration with Microsoft staff during the COVID-19 pandemic using their Cognitive Services, Search, QnA Maker, Azure Healthcare Bot, Power Automate, SharePoint, and webapps.Operation and MaintenanceDepartment of Health and Human Services
HHS-0041-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Amazon Lex and Amazon Polly for the Marketplace Appeals Call CenterCMS/OHI: Amazon Lex & Amazon Polly are used in conjunction with the Amazon Connect phone system (cloud based) for the Marketplace Appeals Call Center. Amazon Lex offers self-service capabilities with virtual contact center agents, interactive voice response (IVR), information response automation, and maximizing information by designing chatbots using existing call center transcripts. Amazon Polly turns text into speech, allowing the program to create applications that talk, and build entirely new categories of speech-enabled products.Operation and MaintenanceDepartment of Health and Human Services
HHS-0042-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Feedback Analysis Solution (FAS)The Feedback Analysis Solution is a system that uses CMS or other publicly available data (such as Regulations.Gov) to review public comments and/or analyze other information from internal and external stakeholders. The FAS uses Natural Language Processing (NLP) tools to aggregate, sort and identify duplicates to create efficiencies in the comment review process. FAS also uses machine learning (ML) tools to identify topics, themes and sentiment outputs for the targeted dataset.Operation and MaintenanceDepartment of Health and Human Services
HHS-0043-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Predictive Intelligence - Incident Assignment for Quality Service Center (QSC).Predictive Intelligence (PI) is used for incident assignment within the Quality Service Center (QSC). The solution runs on incidents created from the ServiceNow Service Portal (https://cmsqualitysupport.servicenowservices.com/sp_ess). The solution analyzes the short description provided by the end user in order to find key words with previously submitted incidents and assigns the ticket to the appropriate assignment group. This solution is re-trained with the incident data in our production instance every 3-6 months based on need.Operation and MaintenanceDepartment of Health and Human Services
HHS-0044-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Fraud Prevention System Alert Summary Report Priority ScoreThis model will use Medicare administrative, claims, and fraud alert and investigations data to predict the likelihood of an investigation leading to an administrative action (positive outcome), supporting CMS in prioritizing their use of investigations resources. This analysis is still in development and the final model type has not been determined yet.Development and AcquisitionDepartment of Health and Human Services
HHS-0045-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Center for Program Integrity (CPI) Fraud Prevention System Models (e.g. DMEMBITheftML, HHAProviderML)These models use Medicare administrative and claims data to identify potential cases of fraud, waste, and abuse for future investigation using random forest techniques. Outputs are used to alert investigators of the potential fraud scheme and associated providers.Operation and MaintenanceDepartment of Health and Human Services
HHS-0046-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Priority Score Model - ranks providers within the Fraud Prevention System using logistic regression based on program integrity guidelines.Inputs - Medicare Claims data, Targeted Probe and Educate (TPE) Data, Jurisdiction information
Output - ranks providers within the FPS system using logistic regression based on program integrity guidelines.Operation and MaintenanceDepartment of Health and Human Services
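The Priority Score Model above scores providers with logistic regression. A toy sketch of how such a score and ranking could work is below; the feature names, weights, and bias are invented placeholders, where a real model would fit them to labeled investigation outcomes.

```python
import math

# Hypothetical weights for illustration only; not CMS's fitted model.
WEIGHTS = {"claims_per_day": 0.8, "avg_billed": 0.5, "prior_tpe_flag": 1.2}
BIAS = -2.0

def priority_score(provider):
    """Logistic-regression-style score in [0, 1]: higher = review first."""
    z = BIAS + sum(WEIGHTS[k] * provider.get(k, 0.0) for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def rank(providers):
    """Rank provider feature dicts by descending priority score."""
    return sorted(providers, key=priority_score, reverse=True)
```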
HHS-0047-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Priority Score Timeliness - forecast the time needed to work on an alert produced by Fraud Prevention System (Random Forest, Decision Tree, Gradient Boost, Generalized Linear Regression)Inputs - Medicare Claims data, TPE Data, Jurisdiction information
Output - forecast the time needed to work on an alert produced by FPS (Random Forest, Decision Tree, Gradient Boost, Generalized Linear Regression).Operation and MaintenanceDepartment of Health and Human Services
HHS-0048-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)CCIIO Enrollment Resolution and Reconciliation System (CERRS)CERRS AI for ClassificationOperation and MaintenanceDepartment of Health and Human Services
HHS-0049-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Central Data Abstraction Tool-Modernized (Modernized-CDAT)- Intake Process Automation (PA) ToolIntake PA uses advanced capabilities (NLP, OCR, AI, ML) to automate, modernize, and reduce manual efforts related to medical record review functions within MA RADV auditsOperation and MaintenanceDepartment of Health and Human Services
HHS-0050-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)CMS Connect (CCN)CCN AI for Global SearchOperation and MaintenanceDepartment of Health and Human Services
HHS-0051-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)CMS Enterprise Portal Services (CMS Enterprise Portal-Chatbot)CMS Enterprise Portal AI for Process Efficiency Improvement| Knowledge ManagementOperation and MaintenanceDepartment of Health and Human Services
HHS-0052-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Federally Facilitated Marketplaces (FFM)FFM AI for Anomaly Detection and Correction| Classification| Forecasting and Predicting Time SeriesInitiationDepartment of Health and Human Services
HHS-0053-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Marketplace Learning Management System (MLMS)MLMS AI for Language Interpretation and TranslationOperation and MaintenanceDepartment of Health and Human Services
HHS-0054-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Medicaid And CHIP Financial (MACFin) Anomaly Detection Model for DSH AuditThe MACFin AI team developed a machine learning model to predict anomalies within DSH audit data. The model flags the most extreme outliers in submitted DSH hospital data based on payment amounts and other characteristics. For example, out of all DSH allocations, the model can identify the top 1-5% outliers in the data for further review and auditing. Such a model facilitates targeted investigation of gaps and barriers. In addition, it can support the process by minimizing overpayments and underpayments and informing amount redistribution.InitiationDepartment of Health and Human Services
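Flagging the top 1-5% of records as outliers, as the MACFin anomaly model does, can be illustrated with a simple robust-score ranking; this median/MAD sketch is only a stand-in for whatever model CMS actually trained.

```python
def flag_outliers(amounts, top_pct=0.05):
    """Flag the top `top_pct` fraction of records by robust z-score
    (distance from the median in units of a MAD-based spread)."""
    n = len(amounts)
    s = sorted(amounts)
    median = s[n // 2]
    mad = sorted(abs(x - median) for x in amounts)[n // 2] or 1.0
    scored = sorted(range(n), key=lambda i: abs(amounts[i] - median) / mad,
                    reverse=True)
    k = max(1, int(n * top_pct))          # keep the top slice for review
    return sorted(scored[:k])             # indices of flagged records
```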
HHS-0055-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Medicaid And CHIP Financial (MACFin) DSH Payment Forecasting modelForecasting model to predict future DSH payments (next 1 year) based on historical data and trends (e.g., the last 1-3 years). Multiple models were trained, both time-series (i.e., statistical) models and machine learning based models, and compared for best performance in terms of average mean error on DSH payment amounts across all hospitals. Because the DSH data were highly disorganized, the team spent time cleaning and combining the data from over 6 years for all states to conduct full model implementation and meaningful analysis. Predicting future DSH payments facilitates early planning and recommendations around trends, redistributions, etc. Modified models can also be built to predict other DSH-related metrics like payment-to-uncompensated ratio, underpayment, or overpayment.InitiationDepartment of Health and Human Services
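The simplest statistical baseline for a one-year-ahead payment forecast of this kind is a linear trend extrapolated one period forward; the sketch below is that baseline only, not the multi-model comparison the MACFin team describes.

```python
def forecast_next(history):
    """Ordinary least-squares linear trend over periodic totals;
    returns the extrapolated value for the next period."""
    n = len(history)
    xs = range(n)
    mx = sum(xs) / n
    my = sum(history) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, history))
    slope = sxy / sxx if sxx else 0.0
    return my + slope * (n - mx)   # predict at x = n
```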
HHS-0056-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Performance Metrics Database and Analytics (PMDA)PMDA AI for Anomaly Detection and Correction| Language Interpretation and Translation| Knowledge ManagementInitiationDepartment of Health and Human Services
HHS-0057-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Relationships, Events, Contacts, and Outreach Network (RECON)RECON AI for Recommender System| Sentiment AnalysisOperation and MaintenanceDepartment of Health and Human Services
HHS-0058-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Risk Adjustment Payment Integrity Determination System (RAPIDS)RAPIDS AI for Classification| Process Efficiency ImprovementOperation and MaintenanceDepartment of Health and Human Services
HHS-0059-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Drug Cost Increase PredictionsUse Historical drug costs increases to predict future increasesInitiationDepartment of Health and Human Services
HHS-0060-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Brand vs Generic Market ShareAnalyze generic drugs compared to brand drugs over time and forecast future market shares based on Part D claims volumeInitiationDepartment of Health and Human Services
HHS-0061-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Drug cost anomaly detectionIdentify anomalies in drug costs on Part D claimsInitiationDepartment of Health and Human Services
HHS-0062-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Artificial Intelligence (AI) Explorers Program Pilot - Automated Technical ProfileThis 90-day pilot engages in research and development to investigate generating a machine-readable Automated Technical Profile for CMS systems, with the goal of inferring the technology fingerprint of CMS projects from multiple data sources at different stages of their development lifecycleDevelopment and AcquisitionDepartment of Health and Human Services
HHS-0063-2023HHSCenters for Medicare & Medicaid Services (CMS)Centers for Medicare & Medicaid Services (CMS)Artificial Intelligence (AI) Explorers Program Pilot - Section 508 accessibility TestingThis 90-day pilot conducts a comprehensive analysis of the data from Section 508 test result documents to better inform CMS technical leads and Application Development Organizations (ADOs), in support of the CMS Section 508 Program.Development and AcquisitionDepartment of Health and Human Services
HHS-0064-2023HHSFDACBER/OBPV/DABRAProcess Large Amount of Submitted Docket CommentsProvide an automated process to transfer, deduplicate, summarize and cluster docket comments using AI/MLImplementationDepartment of Health and Human Services
HHS-0065-2023HHSFDACBER/OBPV/DABRATo develop novel approaches to expand and/or modify the vaccine AESI phenotypes in order to further improve adverse event detectionDeveloping a BERT-like ML model to improve detection of adverse events of special interest by applying a clinically oriented language model pre-trained on clinical documents from UCSFImplementationDepartment of Health and Human Services
HHS-0066-2023HHSFDACBER/OBPV/DABRABEST Platform improves post-market surveillance efforts through the semi-automated detection, validation and reporting of adverse events.The BEST Platform employs a suite of applications and techniques to improve the detection, validation and reporting of biologics-related adverse events from electronic health records (EHRs). The Platform utilizes ML and NLP to detect potential adverse events, and extract the important features for clinicians to validate.ImplementationDepartment of Health and Human Services
HHS-0067-2023HHSFDACDER/Office of Generic DrugsDevelopment of Machine Learning Approaches to Population Pharmacokinetic Model Selection and Evaluation of Application to Model-Based Bioequivalence Analysis1. Development of a deep learning/reinforcement learning approach to population pharmacokinetic model selections
2. Implementation of an established genetic algorithm approach to population pharmacokinetic model selections in Python.Development and AcquisitionDepartment of Health and Human Services
HHS-0068-2023HHSFDACDER/Office of Generic DrugsMachine-Learning based Heterogeneous Treatment Effect Models for Prioritizing Product-Specific Guidance DevelopmentIn this project, we propose to develop and implement a novel machine learning algorithm for estimating heterogeneous treatment effects to prioritize PSG development. Specifically, we propose three major tasks. First, we will address an important problem in treatment effect estimation from observational data, where the observed variables may contain confounders, i.e., variables that affect both the treatment and the outcome. We will build on recent advances in variational autoencoders to introduce a data-driven method to simultaneously estimate the hidden confounders and the treatment effect. Second, we will evaluate our model on both synthetic datasets and previous treatment effect estimation benchmarks. The ground truth data enable us to investigate model interpretability. Third, we will validate the model with the real-world PSG data and explain model output for a particular PSG via collaboration with the FDA team. The real-world datasets are crucial to validate our model; these may include the Orange Book, FDA's PSGs, the National Drug Code directory database, Risk Evaluation and Mitigation Strategies (REMS) data, and IQVIA National Sales Perspectives, which are publicly available, as well as internal ANDA submission data.Development and AcquisitionDepartment of Health and Human Services
HHS-0069-2023HHSFDACDER/Office of Generic DrugsDeveloping Tools based on Text Analysis and Machine Learning to Enhance PSG Review Efficiency1. Develop a novel neural summarization model in tandem with an information retrieval system, tailored for PSG review, with dual attention over both sentence-level and word-level outputs, taking advantage of both extractive and abstractive summarization.
2. Evaluate the new model with the PSG data and the large CNN/Daily Mail dataset.
3. Develop an open-source software package for the text summarization model and the information retrieval system.Development and AcquisitionDepartment of Health and Human Services
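The neural summarizer proposed above combines extractive and abstractive approaches. The extractive half can be illustrated by the classic Luhn-style baseline (score sentences by the frequency of their content words, keep the top k); this simple sketch is far weaker than the dual-attention model the project describes, and the stopword list here is an arbitrary placeholder.

```python
import re
from collections import Counter

def extractive_summary(text, k=2):
    """Return the k highest-scoring sentences in original order,
    scoring each sentence by average content-word frequency."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    stop = {"the", "a", "an", "of", "to", "and", "in", "is", "are", "for",
            "on", "with"}
    freq = Counter(w for w in words if w not in stop)
    def score(s):
        toks = re.findall(r"[a-z']+", s.lower())
        return sum(freq[t] for t in toks if t not in stop) / (len(toks) or 1)
    top = sorted(sentences, key=score, reverse=True)[:k]
    return [s for s in sentences if s in top]
```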
HHS-0070-2023HHSFDACDER/Office of Generic DrugsBEAM (Bioequivalence Assessment Mate) - a Data/Text Analytics Tool to Enhance Quality and Efficiency of Bioequivalence AssessmentWe aim to develop BEAM using verified data analytics packages, text mining, and artificial intelligence (AI) toolsets (including machine learning (ML)), to streamline the labor-intensive work during BE assessments to facilitate high-quality and efficient regulatory assessments.
Development and AcquisitionDepartment of Health and Human Services
HHS-0071-2023HHSFDACDER/Office of New DrugsApplication of Statistical Modeling and Natural Language Processing for Adverse Event AnalysisDrug-induced adverse events (AEs) are difficult to predict for early signal detection, and there is a need to develop new tools and methods to monitor the safety of marketed drugs, including novel approaches for evidence generation. This project will utilize natural language processing (NLP) and data mining (DM) to extract information from approved drug labeling that can be used for statistical modeling to determine when the selected AEs are generally labeled (pre- or post-market) and identify patterns of detection, such as predictive factors, within the first 3 years of marketing of novel drugs. This project is intended to increase our understanding of timing/early detection of AEs, which can be applied to targeted monitoring of novel drugs. Funding will be used to support an ORISE fellow.InitiationDepartment of Health and Human Services
HHS-0072-2023HHSFDACDER/Office of Pharmaceutical Quality (OPQ)Centers of Excellence in Regulatory Science and Innovation (CERSI) project - Leveraging AI for improving remote interactions.This project aims to improve four major areas identified by FDA: transcription, translation, document and evidence management, and co-working space. Automatic speech recognition has been widely used in many applications. Its cutting-edge technology is the transformer-based sequence-to-sequence (seq2seq) model, which is trained to generate transcripts autoregressively and can be fine-tuned on particular datasets. Using pre-trained language models directly may not be suitable because they might not work properly with different accents and with specialized regulatory and scientific terminology, since the models were trained on a specific type of data and may not handle data that differs significantly from their training data. To address this, researchers plan to manually transcribe a set of video/audio recordings to obtain true transcripts, upon which they will fine-tune the model to adapt it to this new domain. Machine translation converts a sequence of text from one language to another. Researchers usually use a "seq2seq" method, where the original text is encoded into a representation a computer can process; this code is then used to generate the translated version of the text. It is like a translator who listens to someone speak in one language and then repeats what they said in another language. Similarly, it is not appropriate to directly apply existing pre-trained seq2seq models, because (a) some languages used in the FDA context might not exist in existing models, and (b) domain-specific terms used at FDA are very different from general human language. To tackle these challenges, models are trained from scratch for some less common languages, and pre-trained models are fine-tuned for major languages.
For both situations, researchers prepare high-quality training sets labeled by experts. University of Maryland CERSI (M-CERSI) plans to build a system to manage different documents and evidence by implementing three sub-systems: (a) a document classifier, (b) a video/audio classifier, and (c) an interactive middleware that connects the trained model at the backend with the input at the frontend. With this, all documents created during co-working can be shared and accessed by all participants.InitiationDepartment of Health and Human Services
HHS-0073-2023HHSFDACDER/Office of Strategic Programs (OSP)Opioid Data Warehouse Term Identification and Novel Synthetic Opioid Detection and Evaluation AnalyticsThe Term Identification and Novel Synthetic Opioid Detection and Evaluation Analytics use publicly available social media and forensic chemistry data to identify novel referents to drug products in social media text. It uses the FastText library to create vector models of each known NSO-related term in a large social media corpus, and provides users with similarity scores and expected prevalence estimates for lists of terms that could be used to enhance future data gathering efforts.Operation and MaintenanceDepartment of Health and Human Services
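The term-identification tool above uses FastText vectors to score how similar candidate terms are to known NSO-related terms. FastText represents words partly through character n-grams; the sketch below uses raw character-trigram count vectors with cosine similarity as a crude proxy for that idea. It is not the FastText library or the FDA tool, and the drug-name strings are illustrative only.

```python
import math
from collections import Counter

def char_ngrams(word, n=3):
    """Character trigrams with boundary markers, echoing FastText's
    subword representation (counts only; no learned embeddings)."""
    w = f"<{word}>"
    return Counter(w[i:i + n] for i in range(len(w) - n + 1))

def similarity(a, b):
    """Cosine similarity of n-gram count vectors: a rough proxy for
    the embedding similarity scores the tool reports."""
    va, vb = char_ngrams(a), char_ngrams(b)
    dot = sum(va[g] * vb.get(g, 0) for g in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0
```

Spelling variants of the same term share most of their trigrams and so score high, which is why subword models are well suited to the noisy vocabulary of social media text.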
HHS-0074-2023HHSFDACDER/Office of Surveillance and Epidemiology (OSE)Artificial Intelligence-based Deduplication Algorithm for Classification of Duplicate Reports in the FDA Adverse Event Reports (FAERS)The deduplication algorithm is applied to nonpublic data in the FDA Adverse Event Reporting System (FAERS) to identify duplicate individual case safety reports (ICSRs). Unstructured data in free text FAERS narratives is processed through a natural language processing system to extract relevant clinical features. Both structured and unstructured data are then used in a probabilistic record linkage approach to identify duplicates. Application of the deduplication algorithm is optimized for processing the entire FAERS database to support data mining.Development and AcquisitionDepartment of Health and Human Services
HHS-0075-2023HHSFDACDER/Office of Surveillance and Epidemiology (OSE)Information Visualization Platform (InfoViP) to support analysis of individual case safety reportsDeveloped the Information Visualization Platform (InfoViP) for post market safety surveillance, to improve the efficiency and scientific rigor of Individual Case Study Reports (ICSRs) review and evaluation process. InfoViP incorporates artificial intelligence and advanced visualizations to detect duplicate ICSRs, create temporal data visualization, and classify ICSRs for useability.Development and AcquisitionDepartment of Health and Human Services
HHS-0076-2023HHSFDACDER/Office of Surveillance and Epidemiology (OSE)Using Unsupervised Learning to Generate Code Mapping Algorithms to Harmonize Data Across Data SystemsThe goal of this project is to assess the potential of data-driven statistical methods for detecting and reducing coding differences between healthcare systems in Sentinel. Findings will inform development and deployment of methods and computational tools for transferring knowledge learned from one site to another and pave the way towards scalable and automated harmonization of electronic health records data.ImplementationDepartment of Health and Human Services
HHS-0077-2023HHSFDACDER/Office of Surveillance and Epidemiology (OSE)Augmenting date and cause of death ascertainment in observational data sourcesThe objective of this project is to develop a set of algorithms to augment assessment of mortality through probabilistic linkage of alternative data sources with EHRs. Development of generalizable approaches to improve death ascertainment is critical to improve validity of Sentinel investigations using mortality as an endpoint, and these algorithms may also be usable in supplementing death ascertainment in claims data as well. Specifically, we propose the following Aims.
Specific Aim 1: We propose to leverage online publicly available data to detect date of death for patients seen at two healthcare systems.
Specific Aim 2: We propose to augment cause of death data using healthcare system narrative text and administrative codes to develop probabilistic estimates for common causes of death.ImplementationDepartment of Health and Human Services
HHS-0078-2023HHSFDACDER/Office of Surveillance and Epidemiology (OSE)Scalable automated NLP-assisted chart abstraction and feature extraction toolThe overall goal of this study is to demonstrate the usability and value of currently available data sources and techniques in electronic medical records by harnessing claims and EHR data, including structured, semi-structured, and unstructured data, in a pharmacoepidemiology study. This study will use real-world longitudinal data from the Cerner Enviza Electronic Health Records (CE EHR) linked to claims with NLP technology applied to physician notes. NLP methods will be used to identify and contextualize pre-exposure confounding variables, incorporate unstructured EHR data into confounding adjustment, and for outcome ascertainment. Use case study; This study will seek to understand the relationship between use of montelukast among patients with asthma and neuropsychiatric events.InitiationDepartment of Health and Human Services
HHS-0079-2023HHSFDACDER/Office of Surveillance and Epidemiology (OSE)MASTER PLAN Y4The overall mission of the Innovation Center is to integrate longitudinal patient-level EHR data into the Sentinel System to enable in-depth investigations of medication outcomes using richer clinical data than are generally available in insurance claims data. The Master Plan lays out a five-year roadmap for the Sentinel Innovation Center to achieve this vision through four key strategic areas: (1) data infrastructure; (2) feature engineering; (3) causal inference; and (4) detection analytics. The projects focus on utilizing emerging technologies including feature engineering, natural language processing, advanced analytics, and data interoperability to improve Sentinel's capabilities.InitiationDepartment of Health and Human Services
HHS-0080-2023HHSFDACDER/Office of Surveillance and Epidemiology (OSE)Onboarding of EHR data partnersIn the currently proposed project (DI6), structured fields from EHRs and linked claims data from two identified commercial data partners will be converted to the Sentinel Common Data Model (SCDM). The SCDM is an organizing CDM that preserves the original information from a data source and has been successfully used in the Sentinel system for over a decade. While originally built for claims data, SCDM was expanded in 2015 to accommodate some information commonly found in EHRs in separate clinical data tables to capture laboratory test results of interest and vital signs. We selected the SCDM over other CDMs because data formatted in the SCDM enables analyses that can leverage the standardized active risk identification and analysis (ARIA) tools. Operationally, both Data Partners will share SCDM transformed patient-level linked EHR-claims data with the IC after quality assessments are passed. This is a substantial advantage in this early stage of understanding how to optimally analyze such data. It will allow Sentinel investigators to directly work with the data, adapt existing analytic programs, and test algorithms. In sum, transformation of structured data from the proposed sources to SCDM format will be a key first step for potential future incorporation of these Data Partners into Sentinel to provide access to EHR-claims linked data for >10 million patients, which will be critical to meet the need identified in the 5-year Sentinel System strategic plan of 2019.InitiationDepartment of Health and Human Services
HHS-0081-2023HHSFDACDER/Office of Surveillance and Epidemiology (OSE)Creating a development networkThis project has the following specific Aims:
Aim 1: To convert structured data from EHRs and linked claims into Sentinel Common Data Model at each of the participating sites
Aim 2: To develop a standardized process for storage of free-text notes locally at each site and develop steps for routine metadata extraction from these notes to facilitate direct investigator access for timely execution of future Sentinel tasks.InitiationDepartment of Health and Human Services
HHS-0082-2023HHSFDACDER/Office of Surveillance and Epidemiology (OSE)Empirical evaluation of EHR-based signal detection approachesThis project will develop approaches for abstracting and combining structured and unstructured EHR data as well as expanding TBSS methods to also identify signals for outcomes identifiable only through EHR data (e.g. natural language processing, laboratory values).InitiationDepartment of Health and Human Services
HHS-0083-2023HHSFDACDER/Office of Surveillance and Epidemiology (OSE)Label comparison tool to support identification of safety-related changes in drug labelingA tool with AI capabilities used to assist humans in their review and comparison of drug labeling in PDF format to identify safety-related changes occurring over time. The FDA uses postmarket data to update drug labeling, which can include a broad range of new safety-related issues; safety updates may be added to various sections of drug labeling. The tool's BERT natural language processing model was trained to identify potential text related to newly added safety issues between versions of drug labeling.Development and AcquisitionDepartment of Health and Human Services
HHS-0084-2023HHSFDACDER/Office of Surveillance and Epidemiology (OSE)Artificial Intelligence (AI) Supported Annotation of FAERS ReportsDevelop a prototype software application to support the human review of FAERS data by developing computational algorithms to semi-automatically categorize FAERS reports into meaningful medication error categories based on report free text. Leveraged existing annotated reports and worked with subject matter experts to annotate subsets of FAERS reports, to generate initial NLP algorithms that can classify any report as being medication-error related and identify the type of medication error. An innovative active learning approach was then used to annotate reports and build more robust algorithms for more accurate categorization.Development and AcquisitionDepartment of Health and Human Services
HHS-0085-2023HHSFDACDER/Office of Translational SciencesCommunity Level Opioid Use Dynamics Modeling and SimulationThe OUD project leverages artificial intelligence techniques, specifically Agent-Based Modeling (ABM), to design and carry out Community Level Opioid Use Dynamics Modeling and Simulation with a cohort of datasets and to investigate the propagation mechanisms involving various factors including geographical and social influences and more, and their impacts at a high level. The project also leveraged Machine Learning (ML), such as Classification, to identify data entry types (e.g., whether a particular data entry is entered by a person in the target population, e.g., a woman of child-bearing ages) as part of the training data generation task.InitiationDepartment of Health and Human Services
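The OUD project uses agent-based modeling (ABM) to study how use dynamics propagate through social and geographic structure. A generic toy ABM of diffusion on a random contact network is sketched below; every parameter (network size, spread and recovery probabilities) is an invented placeholder, and the model carries none of the FDA project's actual structure or data.

```python
import random

def run_abm(n=200, edges_per_node=4, p_spread=0.05, p_recover=0.02,
            initial=5, steps=52, seed=0):
    """Toy agent-based diffusion on a random contact network. Each step,
    every active agent influences each neighbor with probability
    p_spread and deactivates with probability p_recover. Returns the
    count of active agents at each step."""
    rng = random.Random(seed)
    nbrs = [set() for _ in range(n)]
    for i in range(n):                    # build a sparse random graph
        for j in rng.sample(range(n), edges_per_node):
            if i != j:
                nbrs[i].add(j)
                nbrs[j].add(i)
    active = set(rng.sample(range(n), initial))
    history = [len(active)]
    for _ in range(steps):
        nxt = set(active)
        for a in active:
            for b in nbrs[a]:             # spread along contacts
                if b not in active and rng.random() < p_spread:
                    nxt.add(b)
            if rng.random() < p_recover:  # spontaneous deactivation
                nxt.discard(a)
        active = nxt
        history.append(len(active))
    return history
```

Running the simulation over many seeds and parameter settings is how an ABM study explores which propagation factors dominate at the population level.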
HHS-0086-2023HHSFDACDER/Office of Translational Sciences/Office of BiostatisticsAutomatic Recognition of Individuals by Pharmacokinetic Profiles to Identify Data AnomaliesIn efforts to detect data anomalies under ANDA, Office of Biostatistics, Division of Biometrics VIII created an R shiny application, DABERS (Data Anomalies in BioEquivalence R Shiny) to support OSIS and OGD. Despite its demonstrated effectiveness, a major drawback is that the pharmacokinetics and pharmacodynamics may be too complicated to describe with a single statistic. Indeed, the current practice offers no practical guidelines regarding how similar PK profiles from different subjects can be in order to be considered valid. This makes it difficult to assess the adequacy of data to be accepted for an ANDA and requires additional information requests to applicants. This project will address the current gap in identifying the data anomalies and potential data manipulations by use of state-of-the-art statistical methods, specifically focusing on machine learning and data augmentation. The purpose of the project is twofold. First, from a regulatory perspective, our project will provide a data driven method that can model complex patterns of PK data to identify potential data manipulations under an ANDA. Second, from a public health research and drug development point of view, the proposed study can potentially be used to understand and quantify the variability in drug response, to guide stratification and targeting of patient subgroups, and to provide insight into what the right drug and right range of doses are for those subgroups.Development and AcquisitionDepartment of Health and Human Services
HHS-0087-2023HHSFDACDER/Office of Translational Sciences/Office of BiostatisticsCluePoints CRADAThis project uses unsupervised machine learning to detect and identify data anomalies in clinical trial data at the site, country and subject levels. This project will consider multiple use cases with the goals of improving data quality and data integrity, assisting site selection for inspection, and assisting reviewers by identifying potentially problematic sites for sensitivity analyses.Development and AcquisitionDepartment of Health and Human Services
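The CRADA entry does not describe the actual algorithms. As a minimal, assumption-laden illustration of unsupervised site-level anomaly flagging, one can standardize a per-site summary statistic and flag sites far from the cross-site distribution (a toy stand-in, not CluePoints' method; the real system scores many variables across site, country, and subject levels at once):

```python
import statistics

def flag_outlier_sites(site_means: dict[str, float], z_threshold: float = 2.0) -> list[str]:
    """Flag sites whose mean value for some trial variable deviates
    strongly from the cross-site distribution.

    Toy unsupervised check: no labels are needed, only the assumption
    that most sites behave similarly and outliers merit review.
    """
    values = list(site_means.values())
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    return [site for site, v in site_means.items() if abs(v - mu) > z_threshold * sd]

# Hypothetical per-site means of some lab measurement; S10 is anomalous
sites = {"S01": 5.1, "S02": 4.9, "S03": 5.0, "S04": 5.2, "S05": 4.8,
         "S06": 5.1, "S07": 4.9, "S08": 5.0, "S09": 5.2, "S10": 9.8}
print(flag_outlier_sites(sites))  # ['S10']
```

Flagged sites would then go to human reviewers, e.g. to prioritize inspection or sensitivity analyses, as the entry describes.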
HHS-0088-2023HHSFDACDER/Office of Translational Sciences/Office of Clinical PharmacologyClinical Study Data Auto-transcribing Platform (AI Analyst) for Generating Evidence to Support Drug LabellingThe AI Analyst platform is trained to auto-author clinical study reports from the source data to assess the strength and robustness of analytical evidence for supporting drug labelling languages. The platform directly transcribes SDTM (Study Data Tabulation Model) datasets of phase I/II studies into full-length clinical study reports autonomously with minimal human input. The underlying AI algorithm mimics the thinking process of subject matter experts (e.g., clinicians, statisticians, and data managers) to decipher the full details of study design and conduct, and to interpret the study results according to the study design. It consists of multiple layers of data pattern recognition. The algorithm addresses the challenging nature of assessing clinical study results, including the huge variety of study designs, unpredictable study conduct, variations of data reporting nomenclature/format, and the wide range of study-specific analysis methods. The platform has been trained and tested with hundreds of NDA/BLA submissions and over 1500 clinical trials. The compatible study types include most drug label supporting studies, such as drug interaction, renal/hepatic impairment, and bioequivalence. In 2022, the Office of Clinical Pharmacology (OCP/OTS/CDER) initiated the RealTime Analysis Depot (RAD) project aiming to routinely apply the AI platform to support the review of NME, 505b2 and 351K submissions.ImplementationDepartment of Health and Human Services
HHS-0089-2023HHSFDACFSAN /OFASData Infrastructure Backbone for AI applicationsOFAS is creating a data lake (WILEE knowledgebase) that ingests and integrates data from a variety of data sources to support our use of advanced analytics in driving risk-based decision making. The sources of data include internal stakeholder submission data, data generated by OFAS staff, scientific information from PubMed, NIH and other scientific publications, CFSAN generated data such as the total diet study, news articles and blog posts, publications from sister agencies, food ingredient and packaging data, food sales data, etc. The design of this data store allows for the automated ingestion of new data while allowing for manual curation where necessary. It is also designed to enable the identification, acquisition and integration of new data sources as they become available. The design of the data lake centralizes information about CFSAN regulated products, food additives, color additives, GRAS substances and food contact substances, and integrates the different sources of information with stakeholder submission information contained in FARM and cheminformatics information in CERES, enabling greater insights and more efficient knowledge discovery during review of premarket submissions and post-market monitoring of the U.S. food supply.Operation and MaintenanceDepartment of Health and Human Services
HHS-0090-2023HHSFDACFSAN/OFASAI Engine for Knowledge discovery, Post-market Surveillance and Signal DetectionThe use of Artificial Intelligence in post-market surveillance and signal detection will enhance CFSAN's ability to detect potential problems associated with CFSAN commodities, including leveraging data to investigate potential issues with chronic, long-term exposure to food additives, color additives, food contact substances and contaminants, or long-term use of cosmetics. The OFAS Warp Intelligent Learning Engine (WILEE) project seeks to establish an intelligent knowledge discovery and analytic agent for the Office. WILEE (pronounced Wiley) provides a horizon-scanning solution, analyzing data from the WILEE knowledgebase, to enable the Office to maintain a proactive posture and the capacity to forecast industry trends, so that the Office can stay ahead of the development cycle, prepare for how to handle a large influx of submissions (operational risk - e.g., a change in USDA rules regarding antimicrobial residue levels in poultry processing), and prioritize actions based on risk or stakeholder-perceived risk regarding substances under OFAS purview (e.g., the yoga mat incident). WILEE will provide the Office with an advanced data-driven, risk-based decision-making tool that leverages AI technologies to integrate and process a large variety of data sources, generating reports with quick insights that will significantly improve our time-to-results.ImplementationDepartment of Health and Human Services
HHS-0091-2023HHSFDACFSAN/OFASEmerging Chemical Hazard Intelligence Platform (ECHIP - completed)This is an AI solution designed to identify emerging, potential chemical hazards or emerging stakeholder concerns regarding potential hazards associated with substances of interest to CFSAN. Implementation of this solution will enable CFSAN to take proactive measures to protect and/or address concerns from our stakeholders. ECHIP uses data from the news and social media, and the scientific literature to identify potential issues that may require CFSAN's attention. In real-world examples without the ECHIP AI solution, signal identification and verification have taken 2-4 weeks, depending on the number of scientists dedicated to reviewing the open literature, news and social media. Results from pilot studies indicate that ECHIP could reduce the overall signal detection and validation process to about 2 hours. ECHIP accomplishes this reduction by automatically ingesting, reviewing, analyzing and presenting data from multiple sources to scientists in such a way that signal detection and verification can be done in a very short time period.Operation and MaintenanceDepartment of Health and Human Services
HHS-0092-2023HHSFDACTP/OS/DRSIOSCAROSCAR (Office of Science Customer Assistance Response) is a chatbot with predefined intents that lets customers get help from the Customer Service Center. It offers a 24/7 user interface allowing users to input questions and view previous responses, as well as a dashboard offering key metrics for admin users.Operation and MaintenanceDepartment of Health and Human Services
HHS-0093-2023HHSFDACTP/OS/DRSISSTATThe Self-Service Text Analytics Tool (SSTAT) is used to explore the topics of a set of documents. Documents can be submitted to the tool in order to generate a set of topics and associated keywords. A visual listing of the documents and their associated topics is automatically produced to provide a quick snapshot of the submitted documents.Operation and MaintenanceDepartment of Health and Human Services
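SSTAT's internals are not published in this entry. As a toy stand-in for the per-document topic/keyword output it describes, a term-frequency extractor conveys the idea; real topic models (e.g., LDA) infer topics across the whole corpus rather than counting terms per document, and the stopword list here is a deliberate simplification:

```python
from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "for"}

def top_keywords(document: str, k: int = 3) -> list[str]:
    """Return the k most frequent non-stopword terms in a document.

    Illustrative only: a keyword list like this is the kind of
    per-document summary a tool such as SSTAT surfaces visually.
    """
    tokens = [t.strip(".,").lower() for t in document.split()]
    counts = Counter(t for t in tokens if t and t not in STOPWORDS)
    return [term for term, _ in counts.most_common(k)]

doc = "Tobacco product applications describe product ingredients and product marketing plans."
print(top_keywords(doc))  # 'product' ranks first
```

Listing each document alongside its top keywords already gives the quick corpus snapshot the entry describes, even before a full topic model is fitted.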
HHS-0094-2023HHSFDACTP/OS/DRSIASSIST4TOBACCOASSIST4Tobacco is a semantic search system that helps CTP stakeholders find tobacco authorization applications more accurately and efficiently.ImplementationDepartment of Health and Human Services
HHS-0095-2023HHSFDACVMUsing XGBoost Machine Learning Method to Predict Antimicrobial Resistance from WGS dataGenomic data and artificial intelligence/machine learning (AI/ML) are used to study antimicrobial resistance (AMR) in Salmonella, E. coli, Campylobacter, and Enterococcus, isolated from retail meats, humans, and food producing animals. The eXtreme Gradient Boosting (XGBoost) model is implemented to improve upon categorical resistant-versus-susceptible predictions by predicting antimicrobial Minimum Inhibitory Concentrations (MICs) from WGS data.Development and AcquisitionDepartment of Health and Human Services
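To sketch why predicting MICs improves on a binary call: a predicted MIC can always be mapped back to a categorical result via breakpoints, while also preserving how close an isolate sits to the boundary. The breakpoint values below are hypothetical; real ones come from CLSI tables and vary by drug and organism, and the upstream XGBoost regressor (predicting MIC on a log2 dilution scale from genomic features) is assumed, not shown:

```python
def mic_to_category(mic_ug_ml: float, susceptible_bp: float, resistant_bp: float) -> str:
    """Convert a (predicted) MIC to a categorical call using breakpoints.

    Hypothetical mapping step: in the pipeline described, a regressor
    first predicts the MIC from WGS-derived features, and this function
    recovers the resistant/susceptible call for comparison with
    laboratory phenotypes.
    """
    if mic_ug_ml <= susceptible_bp:
        return "susceptible"
    if mic_ug_ml >= resistant_bp:
        return "resistant"
    return "intermediate"

# Hypothetical breakpoints: S <= 4 ug/mL, R >= 16 ug/mL
print(mic_to_category(2.0, 4.0, 16.0))   # susceptible
print(mic_to_category(8.0, 4.0, 16.0))   # intermediate
print(mic_to_category(32.0, 4.0, 16.0))  # resistant
```

Because the categorical call is derived rather than predicted directly, updated breakpoints can be applied without retraining the model.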
HHS-0096-2023HHSFDANCTRDevelopment of virtual animal models to simulate animal study results using Artificial Intelligence (AI)Testing data from animal models provides crucial evidence for the safety evaluation of chemicals. These data have been an essential component in regulating drug, food, and chemical safety by regulatory agencies worldwide, including FDA. As a result, a wealth of animal data is available from the public domain and other sources. As the toxicology community and regulatory agencies move towards a reduction, refinement, and replacement (3Rs principle) of animal studies, we proposed an AI-based generative adversarial network (GAN) architecture to learn from existing animal studies so that it can generate animal data for new and untested chemicals without conducting further animal experiments. The FDA has developed guidelines and frameworks to modernize toxicity assessment with alternative methods, such as the FDA Predictive Toxicology Roadmap and the Innovative Science and Technology Approaches for New Drugs (ISTAND). These programs facilitate the development and evaluation of alternative methodologies to expand the FDA's toxicology predictive capabilities, to reduce the use of animal testing, and to facilitate drug development. A virtual animal model with the capability of simulating animal studies could serve as an alternative to animal studies to support the FDA mission.InitiationDepartment of Health and Human Services
HHS-0097-2023HHSFDANCTRAssessing and mitigating bias in applying Artificial Intelligence (AI) based natural language processing (NLP) of drug labeling documentsAs use of AI in biomedical sciences increases, significant concerns have been raised regarding bias, stereotyping, or prejudice in some AI systems. An AI system trained on inappropriate or inadequate data may reinforce biased patterns and thus provide biased predictions. Particularly, when an AI model is trained on datasets from different domains and then transferred to a new application domain, the system needs to be evaluated properly to avoid potential bias risks.
Given the increased number of transfer learning and AI applications in document analysis to support FDA review, this proposal is to conduct a comprehensive study to understand and assess the bias in applying AI-based natural language processing to drug labeling documents, and to develop a strategy to mitigate such bias.InitiationDepartment of Health and Human Services
HHS-0098-2023HHSFDANCTRIdentify sex disparities in opioid drug safety signals in the FDA Adverse Event Reporting System (FAERS) and social media (Twitter) to improve women's healthThis proposal aims to address OWH 2023 Priority Area: Use of real world data and evidence to inform regulatory processes.
We propose to analyze sex differences in adverse events for opioid drugs in social media (Twitter) and the FDA Adverse Event Reporting System (FAERS). We will compare sex disparities identified from FAERS and Twitter to assess whether Twitter data can be used as an early warning system to signal opioid-related issues specific to women. The identified sex disparities in adverse events for opioid drugs from this project could help improve women's health.InitiationDepartment of Health and Human Services
HHS-0099-2023HHSFDANCTRPrediction of adverse events from drug - endogenous ligand - target networks generated using 3D-similarity and machine learning methods.Excluding areas of the biochemical space near activity cliffs [1], molecular similarity [2] has long proven to be an outstanding tool in virtual screening [3], absorption, distribution, metabolism, and excretion (ADME) [4], drug design [5] and toxicology [6]. Among these, the toxicological response is the most challenging task due to its immense complexity involving multiple pathways and protein targets. Although many adverse drug reactions (ADRs) result from genetic polymorphisms and factors such as the patient's medical history and the treatment dosage and regimen, on a fundamental level all ADRs are initiated by the binding of a drug molecule to a target, whether intended (therapeutic target) or non-intended (off-target interactions with promiscuous proteins) [7]. While molecular similarity approaches designed to identify off-target interaction sites have been explored since the late 2000s [8, 9], most have been focused on drug design, repurposing and more generally, efficacy, whereas relatively few have been applied to toxicology [10, 11].
Since there are multiple approaches to molecular similarity (structural, functional, whole molecule, pharmacophore, etc. [12]), the performance of any of the above applications depends strongly on the metrics by which similarity is quantified. For the past 10 years, DSB has been working on creating a universal molecular modeling approach utilizing unique three-dimensional fingerprints encoding both the steric and electrostatic fields governing the interactions between ligands and receptors. It has been demonstrated that these fingerprints could quantify reliably both the structural and functional similarities between molecules [13, 14] and their application for prediction of adverse events from AI generated drug - endogenous ligand - target networks could provide new insights into yet unknown mechanisms of toxicity.InitiationDepartment of Health and Human Services
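The 3D steric/electrostatic fingerprints this entry describes are specific to the cited work, but the underlying idea of scoring pairs of molecules on a [0, 1] similarity scale can be illustrated with the standard Tanimoto (Jaccard) coefficient over binary fingerprints, a deliberate simplification of the approach in the text:

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto (Jaccard) coefficient between two fingerprint bit sets.

    Generic similarity metric: the ratio of shared on-bits to total
    on-bits. The 3D-SDAR fingerprints in the text are richer, but any
    such metric ultimately yields a pairwise score like this one.
    """
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / len(fp_a | fp_b)

# Hypothetical on-bit indices for a drug and an endogenous ligand
drug = {1, 4, 9, 12, 20}
ligand = {1, 4, 9, 15}
print(round(tanimoto(drug, ligand), 3))  # intersection 3, union 6 -> 0.5
```

In a drug - endogenous ligand - target network like the one proposed, edges could be drawn wherever such a pairwise score exceeds a chosen threshold.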
HHS-0100-2023HHSFDANCTRPredictive toxicology models of drug placental permeability using 3D-fingerprints and machine learningThe human placenta plays a pivotal role in fetal growth, development, and fetal exposure to chemicals and therapeutics. The ability to predict placental permeability of chemicals during pregnancy is an important factor that can inform regulatory decisions related to fetal safety and clinical trials with women of child-bearing potential (WOCBP). The human placenta contains transport proteins, which facilitate the transfer of various endogenous substances and xenobiotics. Several mechanisms allow this transfer: i) passive diffusion, ii) active transport, iii) facilitated diffusion, iv) pinocytosis, and v) phagocytosis. Among these, passive and active transport are the two major routes. Small, non-ionized, highly lipophilic drugs cross the placenta via passive diffusion; however, relatively large molecules (MW > 500 Da) with low lipophilicity are carried by transporters. While prediction of the ability of drugs to cross the placenta via diffusion is straightforward, the complexity of molecular interactions between drugs and transporters has proven to be a challenging problem to solve. Virtually all QSARs (Quantitative Structure Activity Relationships) published to date model small datasets (usually not exceeding 100 drugs) and utilize weak validation strategies [1-5].
In this proposal, 3D-molecular similarities of endogenous placental transporter ligands to known drug substrates will be used to identify the most likely mode of drug transportation (active/passive) and build predictive, quantitative and categorical 3D-SDAR models by linking their molecular characteristics to placental permeability. Permeability data will be collected via mining the literature, the CDER databases, and conducting empirical assessments using in vitro NAMs with confirmation using rodent models. Predictability will be validated using: i) blind test sets including known controls and ii) a small set of drugs with unknown permeabilities, which will be tested in in vitro and in vivo models.InitiationDepartment of Health and Human Services
HHS-0101-2023HHSFDANCTROpioid agonists/antagonists knowledgebase (OAK) to assist review and development of analgesic products for pain management and opioid use disorder treatmentThe number of deaths caused by opioid overdose in the United States has been increasing dramatically for the last decade, and opioid misuse and abuse continue at alarmingly high rates. Opioid use disorder (OUD) often starts with use of prescription opioid analgesics. Therefore, the development of abuse-deterrent analgesic products may significantly impact the trajectory of the opioid crisis. In addition, FDA is making new efforts to support novel product innovation for pain management and the treatment of OUD to combat this opioid crisis.
Opioid agonists bind and activate opioid receptors to decrease calcium influx and cyclic adenosine monophosphate (cAMP), leading to hyperpolarization that inhibits pain transmission. Opioid antagonists bind and inhibit or block opioid receptors. Both opioid agonists and antagonists are used in drug products for pain management and treatment of opioid addiction. An opioid agonists/antagonists knowledgebase (OAK) would be useful for FDA reviewers to inform evaluation and to assist development of analgesics and of additional treatments for OUD.
To create a comprehensive OAK, we propose to curate the experimental data on opioid agonist/antagonist activity from the public domain, experimentally test some 2800 drugs in functional opioid receptor assays using a quantitative high-throughput screening (qHTS) platform, and develop and validate in silico models to predict opioid agonist/antagonist activity. The created OAK knowledgebase could be used for retrieving experimental opioid agonist/antagonist activity data and the related experimental protocols. For chemicals without experimental data, read-across methods could be used to find similar chemicals in OAK to estimate the opioid agonist/antagonist activity, and the in silico models in OAK could be used to predict the opioid agonist/antagonist activity. The retrieved or predicted activity data can then be used to inform regulatory review or to assist in the development of analgesics.ImplementationDepartment of Health and Human Services
HHS-0102-2023HHSFDANCTRDevelopment of a Comprehensive Open Access Molecules with Androgenic Activity Resource (MAAR) to Facilitate Assessment of ChemicalsAndrogen receptor (AR) is a ligand-dependent transcription factor and a member of the nuclear receptor superfamily, which is activated by androgens. AR is the target for many drugs, but it could also act as an off-target for drugs and other chemicals. Therefore, detecting androgenic activity of drugs and other FDA regulated chemicals is critical for evaluation of drug safety and assessment of chemical risk. There is a large amount of androgenic activity data in the public domain, which could be an asset for the scientific community and regulatory science. However, the data are distributed across different and diverse sources and stored in different formats, limiting the use of the data in research and regulation. Therefore, a comprehensive, reliable resource to provide open access to the data and enable modeling and prediction of androgenic activity for untested chemicals is urgently needed. This project will develop a high-quality open access Molecules with Androgenic Activity Resource (MAAR) including data and predictive models fully compliant with the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. MAAR can be used to facilitate research on androgenic activity of chemicals and support regulatory decision making concerning efficacy and safety evaluation of drugs and chemicals in the FDA regulated products.ImplementationDepartment of Health and Human Services
HHS-0103-2023HHSFDANCTRArtificial Intelligence (AI)-based Natural Language Processing (NLP) for FDA labeling documentsFDA has historically generated and continues to generate a variety of documents during the product-review process, which are typically unstructured text and often do not follow standards. Therefore, analysis of semantic relationships plays a vital role in extracting useful information from the FDA documents to facilitate regulatory science research and improve the FDA product review process. The rapid advancement in artificial intelligence (AI) for Natural Language Processing (NLP) offers an unprecedented opportunity to analyze the semantic text data by using language models that are trained with large biomedical corpora. This study is to assess AI-based NLP for the FDA documents with a focus on the FDA labeling documents. Specifically, we will apply the publicly available language models (e.g., BERT and BioBERT) to the FDA drug labeling documents available from the FDALabel tool, which manages over 120K labeling documents including over 40K Human Prescription Drug and Biological Products. We will investigate four areas of AI applications that are important to regulatory science research: (1) the interpretation and classification of drug properties (e.g., safety and efficacy) with AI reading, (2) text summarization to provide highlights of labeling sections, (3) automatic anomaly analysis (AAA) for signal identification, and (4) information retrieval with Amazon-like question answering. We will compare the AI-based NLP with the MedDRA-based approach whenever possible for drug safety and efficacy. The study will provide a benchmark for fit-for-purpose application of the public language models to the FDA documents and, moreover, the outcome of the study could provide a scientific basis to support the future development of the FDALabel tool, which is widely used in the CDER review process.ImplementationDepartment of Health and Human Services
HHS-0104-2023HHSFDANCTRInforming selection of drugs for COVID-19 treatment by big data analytics and artificial intelligenceThe pandemic of COVID-19 is the biggest global health concern currently. As of July 11, 2020, more than 12 million people have tested positive for SARS-CoV-2 infection and more than half a million deaths have been caused by COVID-19 in the world. Currently, no vaccines and/or drugs have been proven effective to treat COVID-19. Therefore, many drug products on the market are being repurposed for the treatment of COVID-19. However, sufficient evidence is needed to determine that the repurposed drugs are safe and effective. Therefore, safety information on the drugs selected for repurposing purposes is important. The proposed project aims to mine adverse drug events using artificial intelligence and big data analytics in the public domain including the agency's database, public databases, and social media data for the drugs to be repurposed for the treatment of COVID-19. The ultimate goal of this project is to provide detailed adverse event information that can be used to facilitate safety evaluation for drugs repurposed for the treatment of COVID-19. The detailed adverse event information will be used to develop recommendations for selecting the right drugs for repurposing efforts and to help select the appropriate COVID-19 patients, and thus better combat the pandemic.ImplementationDepartment of Health and Human Services
HHS-0105-2023HHSFDANCTRTowards Explainable AI: Advancing Predictive Modeling for Regulatory UseArtificial Intelligence (AI) is a broad discipline of training machines to think and accomplish complex intellectual tasks like humans. It learns from existing data/information to predict future outcomes, distill knowledge, offer advice, or plan action steps. The rise of AI has offered both opportunities and challenges to FDA in two aspects: (1) how to assess and evaluate marketed AI-centric products and (2) how to implement AI methods to improve the agency's operation. One of the key aspects of both regulatory applications is to understand the underlying features driving AI performance and, by extension, its interpretability in the context of application.
Different from the statistical evaluation (e.g., accuracy, sensitivity and specificity), model interpretability assessment lacks quantitative metrics. In most cases, the assessment tends to be subjective, where prior knowledge is often used as a ground-truth to explain the biological relevance of underlying features, e.g., whether the biomarkers featured by the model are in accordance with the existing findings. In reality, there is a trade-off between statistical performance and interpretability among different AI algorithms, and understanding the difference will improve the context of use of AI technologies in regulatory science.
For that, we will investigate representative AI methods, in terms of their performance and interpretability, first through benchmark datasets that have been well-established in the research community, then extended to clinical/pre-clinical datasets. This project will provide basic parameters and offer insightful guidance on developing explainable AI models to facilitate real-world decision making in regulatory settings.ImplementationDepartment of Health and Human Services
HHS-0106-2023HHSFDANCTRIdentification of sex differences on prescription opioid use (POU)-related cardiovascular risks by big data analysis1) Prescription opioid use (POU) varies among patient population subgroups, such as gender, age, and ethnicity. POU can potentially cause various adverse effects in the respiratory, gastrointestinal, musculoskeletal, cardiovascular, immune, endocrine, and central nervous systems. Important sex differences have been observed in POU-associated cardiac endpoints. Currently, systematic knowledge is lacking for risk factors associated with the increased cardiotoxicity of POU in women. 2) Currently, the FDA utilizes two methods of analysis for data mining, the Proportional Reporting Ratio (PRR) and the Empirical Bayesian Geometric Mean (EBGM) to identify significant statistical associations between products and adverse events (AEs). These methods are not applicable when two or more reporting measures (e.g. gender, age, race, etc.) must be considered and compared. In this study, a novel statistical model will be developed to detect the safety signals when gender is considered as the third variable. Safety signals will then be detected and compared from combined multiple-layered real-world evidence in the form of EHRs from diverse sources. Sex-dependent differences in risk factors for cardiotoxicity from POU will be identified and analyzed using big data methods and AI-related tools. 3) The proposed project addresses the first of four priority areas of FDA's 2018 Strategic Policy Roadmap: Reduce the burden of addiction crises that are threatening American families, and two priority areas of Women's Health Research Roadmap: Priority Area 1: Advance Safety and Efficacy, and Priority Area 5: Expand Data Sources and Analysis. 
The results may provide information and knowledge to help the FDA drug reviewers and physicians be aware of sex differences to certain POU drugs and combinations of POU with other prescription drugs, therefore, preventing or reducing risk of the POU drug-induced CVD in women.Development and AcquisitionDepartment of Health and Human Services
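For reference, the Proportional Reporting Ratio cited above as one of FDA's current signal-detection methods is computed from a 2x2 table of report counts; this project's contribution is moving beyond such single-table statistics to models that also stratify by sex. A minimal sketch with made-up counts:

```python
def prr(a: int, b: int, c: int, d: int) -> float:
    """Proportional Reporting Ratio from a 2x2 contingency table.

    a: reports with the drug and the event
    b: reports with the drug, without the event
    c: reports with other drugs and the event
    d: reports with other drugs, without the event

    PRR = [a / (a + b)] / [c / (c + d)]
    """
    return (a / (a + b)) / (c / (c + d))

# Hypothetical counts: 30 of 1000 reports for the drug mention the event,
# versus 100 of 10000 reports for all other drugs
print(prr(30, 970, 100, 9900))  # 0.03 / 0.01, i.e. about 3.0
```

A PRR well above 1 (commonly with a minimum report count and a chi-square check) flags a drug-event pair for further review; as the entry notes, the limitation is that a single 2x2 table cannot accommodate additional reporting variables such as sex or age.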
HHS-0107-2023HHSFDANCTRNCTR/DBB-CDER/OCS collaboration on A SafetAI Initiative to Enhance IND Review ProcessThe development of animal-free models has been actively investigated and successfully demonstrated as an alternative to animal-based approaches for toxicity assessments. Artificial Intelligence (AI) and Machine learning (ML) have been the central engine in this paradigm shift to identify safety biomarkers from non-animal assays or to predict safety outcomes solely based on chemical structure data. AI is a computer system or algorithm that has the ability to learn from existing data to foresee future outcomes. ML, a subset of AI, has been specifically studied to make predictions for adverse drug reactions. Deep Learning (DL) is arguably the most advanced approach in ML, which frequently outperforms other types of ML approaches (or conventional ML approaches) for the study of drug safety and efficacy. DL usually consists of multiple layers of neural networks to mimic the cognitive behaviors associated with the human brain's learning and problem-solving processes to solve data intensive problems. Among many studies using AI/ML, DL has become a default algorithm to consider due to its superior performance. This proposal will apply DL to flag safety concerns regarding drug-induced liver injury (DILI) and carcinogenicity during the IND review process.InitiationDepartment of Health and Human Services
HHS-0108-2023HHSNIHNational Institutes of Health (NIH) CCindividual Functional Activity Composite Tool (inFACT)inFACT is being developed for use in the Social Security Administration (SSA) disability determination process to assist adjudicators in identifying evidence on function from case records that might be hundreds or thousands of pages long. inFACT displays information on whole person function as extracted from an individual's free text medical records and aligned with key business elements.Development and AcquisitionDepartment of Health and Human Services
HHS-0109-2023HHSNIHNational Institutes of Health (NIH) CSRAssisted Referral ToolTo provide assistance in assigning appropriate scientific areas for grant applications.Operation and MaintenanceDepartment of Health and Human Services
HHS-0110-2023HHSNIHNational Institutes of Health (NIH) NCINanCI: Connecting ScientistsUses AI to match scientific content to users' interests. By collecting papers into a folder, a user can engage the tool to find similar articles in the scientific literature, and can refine the recommendations by up- or down-voting recommendations. Users can also connect with others via their interests, and receive and make recommendations via this social network.Development and AcquisitionDepartment of Health and Human Services
HHS-0111-2023HHSNIHNational Institutes of Health (NIH) NHLBIDetection of Implementation Science focus within incoming grant applicationsThis tool uses natural language processing and machine learning to calculate an Implementation Science (IS) score that is used to predict if a newly submitted grant application proposes to use science that can be categorized as "Implementation Science" (a relatively new area of delineation). NHLBI uses the "IS score" in its decision for assigning the application to a particular division for routine grants management oversight and administration.Operation and MaintenanceDepartment of Health and Human Services
HHS-0112-2023HHSNIHNational Institutes of Health (NIH) NIAIDFederal IT Acquisition Reform Act (FITARA) ToolThe tool automates the identification of NIAID contracts that are IT-related.Operation and MaintenanceDepartment of Health and Human Services
HHS-0113-2023HHSNIHNational Institutes of Health (NIH) NIAIDDivision of Allergy, Immunology, and Transplantation (DAIT) AIDS-Related Research SolutionThe tool uses natural language processing (NLP), text extraction, and classification algorithms to predict both high/medium/low priority and area of research for a grant application. The incoming grant applications are ranked based on these predictions and more highly-ranked applications are prioritized for review.Operation and MaintenanceDepartment of Health and Human Services
HHS-0114-2023HHSNIHNational Institutes of Health (NIH) NIAIDScientific Research Data Management System Natural Language Processing Conflict of Interest ToolA tool that identifies entities within a grant application to allow NIAID's Scientific Review Program team to more easily identify conflicts of interest (COI) between grant reviewers and applicants using NLP methods (e.g., OCR, text extraction).Operation and MaintenanceDepartment of Health and Human Services
HHS-0115-2023HHSNIHNational Institutes of Health (NIH) NIAIDTuberculosis (TB) Case Browser Image Text DetectionA tool to detect text in images that could be potentially Personally Identifiable Information (PII)/ Protected Health Information (PHI) in TB Portals.Operation and MaintenanceDepartment of Health and Human Services
HHS-0116-2023HHSNIHNational Institutes of Health (NIH) NIAIDResearch Area Tracking ToolA dashboard that incorporates machine learning to help identify projects within certain high-priority research areas.Operation and MaintenanceDepartment of Health and Human Services
HHS-0117-2023HHSNIHNational Institutes of Health (NIH) NIDCRNIDCR Digital Transformation Initiative (DTI)An initiative to create a natural language processing chatbot to improve efficiency, transparency, and consistency for NIDCR employees.Development and AcquisitionDepartment of Health and Human Services
HHS-0118-2023HHSNIHNational Institutes of Health (NIH) NIDCRNIDCR Data BankThe project will permit intramural research program investigators to move large sets of unstructured data into a cloud archival storage, which will scale, provide cost effective data tiering, capture robust meta data sufficient for management and governance, and create secondary or tertiary opportunities for analysis leveraging cognitive services AI/ML/NLP toolsets.Development and AcquisitionDepartment of Health and Human Services
HHS-0119-2023HHSNIHNational Institutes of Health (NIH) NIEHSAutomated approaches for table extractionThis project developed an automated, model-based process to reduce the time and level of effort required for manual extraction of data from tables. Published data tables are a particularly data-rich and challenging presentation of critical information in published research.Development and AcquisitionDepartment of Health and Human Services
HHS-0120-2023HHSNIHNational Institutes of Health (NIH) NIEHSSWIFT Active ScreenerApplies statistical models designed to save screeners time and effort through active learning. Utilizes user feedback to automatically prioritize studies. Supports literature screening for Division of Translational Toxicology evidence evaluations.Operation and MaintenanceDepartment of Health and Human Services
HHS-0121-2023HHSNIHNational Institutes of Health (NIH) NIEHSSplunk IT System Monitoring SoftwareUtilizes machine learning to aggregate system logs from on-premises IT infrastructure systems and endpoints for auditing and cybersecurity monitoring purposes.Operation and MaintenanceDepartment of Health and Human Services
HHS-0122-2023HHSNIHNational Institutes of Health (NIH) NIGMSClinical Trial PredictorThe Clinical Trial Predictor uses an ensemble of several natural language processing and machine learning algorithms to predict whether applications may involve clinical trials based on the text of their titles, abstracts, narratives, specific aims, and research strategies.ImplementationDepartment of Health and Human Services
HHS-0123-2023HHSNIHNational Institutes of Health (NIH) NIGMSStem Cell Auto CoderThe Stem Cell Auto Coder uses natural language processing and machine learning to predict the Stem Cell Research subcategories of an application: human embryonic, non-human embryonic, human induced pluripotent, non-human induced pluripotent, human non-embryonic, and non-human non-embryonic.ImplementationDepartment of Health and Human Services
HHS-0124-2023HHSNIHNational Institutes of Health (NIH) NIGMSJIT Automated Calculator (JAC)The JIT Automated Calculator (JAC) uses natural language processing to parse Just-In-Time (JIT) Other Support forms and determine how much outside support PIs are receiving from sources other than the pending application.ImplementationDepartment of Health and Human Services
HHS-0125-2023HHSNIHNational Institutes of Health (NIH) NIGMSSimilarity-based Application and Investigator Matching (SAIM)The SAIM system uses natural language processing to identify non-NIH grants awarded to NIGMS Principal Investigators. The system aids in identifying whether a grant application has significant unnecessary overlap with one funded by another agency.Development and AcquisitionDepartment of Health and Human Services
HHS-0126-2023HHSNIHNational Institutes of Health (NIH) NLMRemediate Adobe .pdf documents to be more accessibleMany .pdf documents could be made available for public release if they conformed to Section 508 accessibility standards. NLM has been investigating the use of AI developed to remediate Adobe .pdf files not currently accessible to Section 508 standards. The improved files are particularly more accessible to blind users and others who use assistive technology to read.Development and AcquisitionDepartment of Health and Human Services
HHS-0127-2023HHSNIHNational Institutes of Health (NIH) NLMCylanceProtectProtection of Windows and Mac endpoints from CyberthreatsOperation and MaintenanceDepartment of Health and Human Services
HHS-0128-2023HHSNIHNational Institutes of Health (NIH) NLMMEDIQA: Biomedical Question AnsweringUsing and developing AI approaches to automate question answering for different users. This project leverages NLM knowledge sources and traditional and neural machine learning to address a wide range of biomedical information needs. This project aims to improve access by providing a single entry point to NLM resources.InitiationDepartment of Health and Human Services
HHS-0129-2023HHSNIHNational Institutes of Health (NIH) NLMCLARIN: Detecting clinicians' attitudes through clinical notesUnderstanding clinical notes and detecting bias is essential in supporting equity and diversity, as well as quality of care and decision support. NLM is using and developing AI approaches to detect clinicians' emotions, biases and burnout.Development and AcquisitionDepartment of Health and Human Services
HHS-0130-2023HHSNIHNational Institutes of Health (NIH) NLMBest Match: New relevance search for PubMedPubMed is a free search engine for biomedical literature accessed by millions of users from around the world each day. With the rapid growth of biomedical literature, finding and retrieving the most relevant papers for a given query is increasingly challenging. NLM developed Best Match, a new relevance search algorithm for PubMed that leverages the intelligence of our users and cutting-edge machine-learning technology as an alternative to the traditional date sort order.Operation and MaintenanceDepartment of Health and Human Services
HHS-0131-2023HHSNIHNational Institutes of Health (NIH) NLMSingleCite: Improving single citation search in PubMedA search that is targeted at finding a specific document in databases is called a Single Citation search, which is particularly important for scholarly databases, such as PubMed, because it is a typical information need of the users. NLM developed SingleCite, an automated algorithm that establishes a query-document mapping by building a regression function to predict the probability of a retrieved document being the target based on three variables: the score of the highest scoring retrieved document, the difference in score between the two top retrieved documents, and the fraction of a query matched by the candidate citation. SingleCite shows superior performance in benchmarking experiments and is applied to rescue queries that would fail otherwise.Operation and MaintenanceDepartment of Health and Human Services
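The SingleCite record above describes a regression that predicts the probability of a retrieved document being the target from three features. A minimal sketch of that scheme follows; the function name, weights, and bias are illustrative placeholders, not NLM's fitted values:

```python
import math

def singlecite_probability(top_score, second_score, query_match_fraction,
                           weights=(1.2, 0.8, 2.5), bias=-3.0):
    """Logistic-regression-style combination of the three SingleCite features:
    the top retrieved document's score, the gap between the two top scores,
    and the fraction of the query matched by the candidate citation.
    Weights and bias here are made-up values for illustration only."""
    gap = top_score - second_score
    z = (bias
         + weights[0] * top_score
         + weights[1] * gap
         + weights[2] * query_match_fraction)
    # Squash the linear score into a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))
```

A strong top hit with a large score gap and near-complete query match yields a probability close to 1, which is when the query would be treated as a single-citation search.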
HHS-0132-2023HHSNIHNational Institutes of Health (NIH) NLMComputed Author: author name disambiguation for PubMedPubMed users frequently use author names in queries for retrieving scientific literature. However, author name ambiguity (different authors share the same name) may lead to irrelevant retrieval results. NLM developed a machine-learning method to score the features for disambiguating a pair of papers with ambiguous names. Subsequently, agglomerative clustering is employed to collect all papers belonging to the same author from those classified pairs. Disambiguation performance is evaluated with manual verification of random samples of pairs from the clustering results, achieving higher accuracy than other state-of-the-art methods. It has been integrated into PubMed to facilitate author name searches.Operation and MaintenanceDepartment of Health and Human Services
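The Computed Author record describes scoring pairs of papers and then clustering them into per-author groups. This toy sketch merges papers whose pairwise score clears a threshold, a simplified stand-in for agglomerative clustering; the function, threshold, and scores are hypothetical, and the real system scores pairs with a trained ML model:

```python
def cluster_papers(pair_scores, papers, threshold=0.5):
    """Group papers sharing an ambiguous author name into per-author clusters.
    pair_scores maps (paper_a, paper_b) tuples to a similarity score (the
    stand-in for the ML model that scores feature agreement). Pairs above
    the threshold are merged transitively via union-find."""
    parent = {p: p for p in papers}

    def find(p):
        # Walk to the root with path compression.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for (a, b), score in pair_scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for p in papers:
        clusters.setdefault(find(p), []).append(p)
    return sorted(sorted(c) for c in clusters.values())
```

True agglomerative clustering merges the closest clusters iteratively rather than transitively, but the input/output shape is the same: pair scores in, author clusters out.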
HHS-0133-2023HHSNIHNational Institutes of Health (NIH) NLMNLM-Gene: towards automatic gene indexing in PubMed articlesGene indexing is part of the NLM's MEDLINE citation indexing efforts for improving literature retrieval and information access. Currently, gene indexing is performed manually by expert indexers. To assist this time-consuming and resource-intensive process, NLM developed NLM-Gene, an automatic tool for finding gene names in the biomedical literature using advanced natural language processing and deep learning methods. Its performance has been assessed on gold-standard evaluation datasets and is to be integrated into the production MEDLINE indexing pipeline.InitiationDepartment of Health and Human Services
HHS-0134-2023HHSNIHNational Institutes of Health (NIH) NLMNLM-Chem: towards automatic chemical indexing in PubMed articlesChemical indexing is part of the NLM's MEDLINE citation indexing efforts for improving literature retrieval and information access. Currently, chemical indexing is performed manually by expert indexers. To assist this time-consuming and resource-intensive process, NLM developed NLM-Chem, an automatic tool for finding chemical names in the biomedical literature using advanced natural language processing and deep learning methods. Its performance has been assessed on gold-standard evaluation datasets and is to be integrated into the production MEDLINE indexing pipeline.InitiationDepartment of Health and Human Services
HHS-0135-2023HHSNIHNational Institutes of Health (NIH) NLMBiomedical Citation Selector (BmCS)Automation of article selection allows NLM to more efficiently and effectively index and host relevant information for the public. Through automation, NLM is able to standardize article selection and reduce the amount of time it takes to process MEDLINE articles.ImplementationDepartment of Health and Human Services
HHS-0136-2023HHSNIHNational Institutes of Health (NIH) NLMMTIXMachine learning-based system for the automated indexing of MEDLINE articles with Medical Subject Headings (MeSH) terms. Automated indexing is achieved using a multi-stage neural text ranking approach. Automated indexing allows for cost-effective and timely indexing of MEDLINE articles.ImplementationDepartment of Health and Human Services
HHS-0137-2023HHSNIHNational Institutes of Health (NIH) NLMClinicalTrials.gov Protocol Registration and Results System Review AssistantThis research project aims to help ClinicalTrials.gov determine whether the addition of AI could make reviewing study records more efficient and effective.Development and AcquisitionDepartment of Health and Human Services
HHS-0138-2023HHSNIHNational Institutes of Health (NIH) NLMMetaMapMetaMap is a widely available program providing access from biomedical text to the concepts in the Unified Medical Language System (UMLS) Metathesaurus. MetaMap uses NLP to provide a link between the text of biomedical literature and the knowledge, including synonymy relationships, embedded in the Metathesaurus. A flexible architecture in which to explore mapping strategies and their applications is also made available. MTI uses MetaMap to generate potential indexing terms.Operation and MaintenanceDepartment of Health and Human Services
HHS-0139-2023HHSNIHNational Institutes of Health (NIH) NLMPangolin lineage classification of SARS-CoV-2 genome sequencesThe PangoLEARN machine learning tool provides lineage classification of SARS-CoV-2 genome sequences. Classification of SARS-CoV-2 genome sequences into defined lineages supports user retrieval of sequences based on classification and tracking of specific lineages, including those lineages associated with mutations that may decrease the effectiveness of therapeutics or protection provided by vaccination.Operation and MaintenanceDepartment of Health and Human Services
HHS-0140-2023HHSNIHNational Institutes of Health (NIH) OD/DPCPSI/OARHIV-related grant classifier toolA front-end application for scientific staff to input grant information which then runs an automated algorithm to classify HIV-related grants. Additional features and technologies used include an interactive data visualization, such as a heat map, using Plotly Python library to display the confidence level of predicted grants.ImplementationDepartment of Health and Human Services
HHS-0141-2023HHSNIHNational Institutes of Health (NIH) OD/DPCPSI/OPAAutomated approaches to analyzing scientific topicsDeveloped and implemented a validated approach that uses natural language processing and AI/ML to group semantically similar documents (including grants, publications, or patents) and extract AI labels that accurately reflect the scientific focus of each topic to aid in NIH research portfolio analysis.ImplementationDepartment of Health and Human Services
HHS-0142-2023HHSNIHNational Institutes of Health (NIH) OD/DPCPSI/OPAIdentification of emerging areasDeveloped an AI/ML-based approach that computes the age and rate of progress of topics in NIH portfolios. This information can identify emerging areas of research at scale and help accelerate scientific progress.ImplementationDepartment of Health and Human Services
HHS-0143-2023HHSNIHNational Institutes of Health (NIH) OD/DPCPSI/OPAPerson-level disambiguation for PubMed authors and NIH grant applicantsCorrect attribution of grants, articles, and other products to individual researchers is critical for high quality person-level analysis. This improved method for disambiguation of authors on articles in PubMed and NIH grant applicants can inform data-driven decision making.ImplementationDepartment of Health and Human Services
HHS-0144-2023HHSNIHNational Institutes of Health (NIH) OD/DPCPSI/OPAPrediction of transformative breakthroughsThe ability to predict scientific breakthroughs at scale would accelerate the pace of discovery and improve the efficiency of research investments. The initiative has helped identify a common signature within co-citation networks that accurately predicts the occurrence of breakthroughs in biomedicine, on average more than 5 years in advance of the subsequent publication(s) that announced the discovery. There is a patent application filed for this approach: U.S. Patent Application No. 63/257,818 (filed October 20, 2021).ImplementationDepartment of Health and Human Services
HHS-0145-2023HHSNIHNational Institutes of Health (NIH) OD/DPCPSI/OPAMachine learning pipeline for mining citations from full-text scientific articlesThe NIH Office of Portfolio Analysis developed a machine learning pipeline to identify scientific articles that are freely available on the internet and do not require an institutional library subscription to access. The pipeline harvests full-text pdfs, converts them to xml, and uses a Long Short-Term Memory (LSTM) recurrent neural network model that discriminates between reference text and other text in the scientific article. The LSTM-identified references are then passed through our Citation Resolution Service. For more information see the publication describing this pipeline: Hutchins et al 2019 (https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3000385#sec003).Operation and MaintenanceDepartment of Health and Human Services
HHS-0146-2023HHSNIHNational Institutes of Health (NIH) OD/DPCPSI/OPAMachine learning system to predict translational progress in biomedical researchA machine learning system that detects whether a research paper is likely to be cited by a future clinical trial or guideline. Translational progress in biomedicine can therefore be assessed and predicted in real time based on information conveyed by the scientific community's early reaction to a paper. For more information see the publication describing this system: Hutchins et al 2019 (https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3000416)Operation and MaintenanceDepartment of Health and Human Services
HHS-0147-2023HHSNIHNational Institutes of Health (NIH) OD/OERResearch, Condition, and Disease Categorization (RCDC) AI Validation ToolThe goal of the tool is to ensure RCDC categories are accurate and complete for public reporting of data.Development and AcquisitionDepartment of Health and Human Services
HHS-0148-2023HHSNIHNational Institutes of Health (NIH) OD/OERInternal Referral Module (IRM)The IRM initiative automates a manual process by using Artificial Intelligence and Natural Language Processing capabilities to help predict the referral of grant applications to NIH Institutes and Centers (ICs), helping Program Officers make informed decisions.ImplementationDepartment of Health and Human Services
HHS-0149-2023HHSNIHNational Institutes of Health (NIH) OD/OERNIH Grants Virtual AssistantChat Bot to assist users in finding grant related information via OER resourcesOperation and MaintenanceDepartment of Health and Human Services
HHS-0150-2023HHSNIHNational Institutes of Health (NIH) OD/ORFTool for Natural Gas Procurement PlanningWith this tool, NIH can establish a natural gas procurement plan and set realistic price targets based on current long-term forecasts.ImplementationDepartment of Health and Human Services
HHS-0151-2023HHSNIHNational Institutes of Health (NIH) OD/ORFNIH Campus Cooling Load ForecasterThis project forecasts the NIH campus's chilled water demand for the next four days. With this information, the NIH Central Utilities Plant management can plan and optimize the chiller plant's operation and maintenance.Operation and MaintenanceDepartment of Health and Human Services
HHS-0152-2023HHSNIHNational Institutes of Health (NIH) OD/ORFNIH Campus Steam Demand ForecasterThis project forecasts the NIH campus steam demand for the next four days. With this information, the stakeholders at the NIH Central Utilities Plant can plan and optimize the plant operation and maintenance in advance.Operation and MaintenanceDepartment of Health and Human Services
HHS-0153-2023HHSNIHNational Institutes of Health (NIH) OD/ORFChiller Plant OptimizationThis project will help to reduce the energy usage for producing chilled water to cool the NIH campus.Development and AcquisitionDepartment of Health and Human Services
HHS-0154-2023HHSNIHNational Institutes of Health (NIH) OD/ORFNatural Language Processing Tool for Open Text AnalysisThis project will improve facility readiness and reduce downtime by allowing other software to analyze data that was locked away in open text.Development and AcquisitionDepartment of Health and Human Services
HHS-0155-2023HHSOIGOIGContracts and Grants Analytics PortalThe Contracts and Grants Analytics Portal uses AI to enhance HHS OIG staff's ability to access grants-related data quickly and easily: staff can navigate directly to the text of relevant findings across thousands of audits, discover similar findings, analyze trends, compare data between OPDIVs, and see preliminary assessments of potential anomalies between grantees.Operation and MaintenanceDepartment of Health and Human Services
HHS-0156-2023HHSOIGOIGText Analytics PortalThe text analytics portal allows personnel without an analytics background to quickly examine text documents through a related set of search, topic modeling, and entity recognition technologies. The initial implementation focuses on HHS-OIG-specific use cases.ImplementationDepartment of Health and Human Services
HUD-0000-2023HUDConsolidated Plan Pilot AnalysisIn March 2023, PD&R began a pilot project to analyze aspects of HUD's Consolidated Plans. HUD requires grantees of its formula block grant programs to submit Consolidated Plans, which are meant to identify and assess affordable housing and community development needs and market conditions. These plans are publicly available via HUD's website. HUD staff currently review these plans for compliance, but HUD lacks the capacity to do in-depth analysis of commonalities or trends contained within plans. This pilot project will explore creating a database and chat-bot that will enable HUD staff to query features of the nearly 1,000 active Consolidated Plans. This pilot exercise has the potential to inform grantees, technical assistance, and other programmatic tweaks, as well as inform how advanced data science tools can benefit our programs and operations.Department of Housing and Urban Development
NARA-0000-2023NARAInformation ServiceAI Pilot Project to Screen and Flag for Personally Identifiable Information (PII) in Digitized Archival RecordsThe NARA Information Service (I) team is collaborating with the Office of Innovation (V), Research Services (R), and the Office of General Counsel (NGC) on a pilot project to use artificial intelligence (AI) tools available on the Amazon Web Services Platform (AWS) and/or Google Cloud Platform to identify and redact Social Security Numbers, Dates of Birth, and other personally identifiable information (PII) in digitized archival records.
This pilot project will screen digitized pages already in the National Archives Catalog (NAC or the Catalog) and both internal NARA and external partner digitized pages that are in the queue to be added to the Catalog. The PII Detection pilot can detect PII in all documents and parent document groups that have a National Archives Identifier (NAID) associated with them and are accessible via the NARA Catalog API. The pilot uses a weighted scoring algorithm to assign higher scores to documents with the most sensitive information (as defined by agency needs).
The agency further plans to enhance this prototype into a user-interface-driven tool that the Legal, Business, and Security teams can use to run preliminary scans on unpublished information. The agency also plans to enhance this prototype by adding custom entities for detection."Planned (not in production)1. Text extraction Machine Learning (ML) service which uses OCR to extract the text/data from scanned images. 2. Automated NLP (Natural Language Processing) to detect PII in the text extracted from scanned images.National Archives and Records Administration
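The weighted scoring step in the NARA PII pilot can be illustrated with a minimal sketch. The entity labels, weights, and function names below are hypothetical, since the inventory does not specify them; they only show the shape of "sum per-entity weights, rank highest first":

```python
# Illustrative entity weights; real weights would be set by agency needs.
ENTITY_WEIGHTS = {"SSN": 10, "DATE_OF_BIRTH": 5, "ADDRESS": 3, "NAME": 2}

def pii_sensitivity_score(detected_entities):
    """Score one document by summing per-entity-type weights over the PII
    entities an NLP pass detected in its OCR-extracted text. Unknown
    entity types get a default weight of 1."""
    return sum(ENTITY_WEIGHTS.get(e, 1) for e in detected_entities)

def rank_documents(doc_entities):
    """Given {naid: [detected entity types]}, return (naid, score) pairs
    with the most sensitive documents first."""
    scored = [(naid, pii_sensitivity_score(ents))
              for naid, ents in doc_entities.items()]
    return sorted(scored, key=lambda item: -item[1])
```

Reviewers would then triage from the top of the ranked list, so a document containing an SSN and a name surfaces ahead of one containing only an address.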
NARA-0001-2023NARAInformation ServiceAuto-fill of Descriptive Metadata for Archival DescriptionsArchival description, or self-describing records, is the process of filling out descriptive metadata for records that will be released to the public. When records are released to the public, they need to be described with a summary, authorities, and other fields that allow the records to be surfaced during a search. NARA has released millions of pages of records to the public via the National Archives Catalog at catalog.archives.gov. Most of the records have a very minimal set of descriptive metadata, as describing them is a very intensive manual process. The self-describing records capability will look at the content of the document and the various metadata available from the records management system, such as originating agency, and predict values for the descriptive metadata.Planned (not in production)Standard machine learning to predict values for descriptive metadata fields given various inputs such as the content and metadata from the records management system.National Archives and Records Administration
NARA-0002-2023NARAInformation ServiceAutomated Data Discovery and Classification PilotNARA is planning to conduct a future pilot to test AI/ML-based automated data discovery and classification using public/mock-up datasets. In this pilot we will also test both supervised and unsupervised AI/ML techniques.
We're planning to use a vendor's COTS solution/ML algorithm, a "document classifier", which allows the customer to search and discover full documents rather than individual sensitive data elements such as SSNs or credit card numbers. This technique allows the "finding" of discovery to be a document. Our customers can search for and discover all their RFPs, purchase orders, NDAs, financial statements, budget documents, resumes, etc. In cases where NARA has a document type in mind that the vendor's COTS solution does not already understand, we should be able to assemble a learning set of documents (typically 20 - 100 examples) and train the vendor algorithm to find all documents of that type."Planned (not in production)Document/File Classification: Document/file classification is a supervised ML algorithm that classifies whole documents according to their type. The algorithm works by converting each document to a term frequency–inverse document frequency (tf-idf) numerical representation and passing these vectors through a multi-layer neural network to finally get the document's type/class. Document/File Clustering: Document/file clustering is an unsupervised ML algorithm that groups similar files together according to their content. For example, non-disclosure agreements will cluster together while product presentation files will be assigned to a different cluster.National Archives and Records Administration
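The tf-idf representation the classifier description mentions can be computed as in this minimal pure-Python sketch (the multi-layer neural network stage that consumes the vectors is omitted; the function name is illustrative):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Convert tokenized documents into sparse tf-idf vectors (dicts of
    term -> weight). tf is the term's frequency within the document; idf
    is log(N / document-frequency), so terms appearing in every document
    get weight zero."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))           # count each term once per document
    idf = {t: math.log(n / df[t]) for t in df}

    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (tf[t] / len(doc)) * idf[t] for t in tf})
    return vectors
```

A classifier then treats each vector as a feature set: terms distinctive to NDAs score high in NDAs and near zero elsewhere, which is what lets whole documents, rather than individual data elements, be the unit of discovery.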
NARA-0003-2023NARAInformation ServiceSemantic Search for National Archives Catalog - an Artificial Intelligence (AI) / Machine Learning (ML) Pilot ProgramThe National Archives and Records Administration (NARA) is responsible for preserving and
providing access to the records of the United States federal government. The NARA Catalog
contains millions of records and documents critical to researchers, historians, and the general
public and finding the correct records or documents can be a time-consuming and challenging task.
Semantic search can solve this problem by allowing users to search the catalog using natural language queries. Semantic search is a data searching technique that not only finds the matching keywords based on user search terms but also understands the user's intent and contextual meaning behind the search terms. Semantic search can also help to improve the accuracy and relevance of search results. By analyzing the meaning and context of search queries, semantic search can provide more accurate and relevant results than traditional keyword-based search methods. This can help researchers and historians to find the records and documents they need more quickly and easily.
Additionally, semantic search can help to identify the relationships between records and documents in the NARA catalog. This can help to provide a more comprehensive understanding of the historical events and processes represented in the records and can facilitate new insights and discoveries."Planned (not in production)National Archives and Records Administration
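At its core, semantic search ranks documents by the similarity of their embeddings to the query embedding. A minimal sketch follows, assuming embeddings already exist (a real system would produce them with a sentence-encoder model); the function names and vectors are illustrative:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def semantic_search(query_vec, catalog, top_k=3):
    """Return the top_k catalog entry IDs whose embeddings are closest
    to the query embedding. catalog maps entry ID -> embedding vector."""
    ranked = sorted(catalog.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [entry_id for entry_id, _ in ranked[:top_k]]
```

Because matching happens in embedding space rather than on literal keywords, a query phrased in natural language can retrieve records that share meaning but not vocabulary with the search terms.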
NARA-0004-2023NARANGC FOIA OfficeFreedom of Information Act (FOIA) Discovery AI PilotNARA would like to utilize various AI techniques to respond to FOIA requests. The AI system will do two things. First, the AI system would offer an NLP based search technique based on the content similarity between the query and the content of the records. The second main AI application would be to redact based on the nature of the FOIA request. Most of the time, the personal information is redacted, but additional information would also need to be redacted based on the requester.Planned (not in production)AI based Vector search or Content similarity search.National Archives and Records Administration
NASA-0000-2023NASAAmes Research CenterAdaStressTesting complex systems often requires computationally intensive Monte Carlo sampling approaches to identify
possible faults. In systems where the number of faults is low but safety critical, this form of testing may be
infeasible due to the large number of samples needed to catch a rare fault. AdaStress instead uses reinforcement
learning to more efficiently sample low-likelihood but high-impact faults."In-useReinforcement LearningNational Aeronautics and Space Administration
NASA-0001-2023NASAAmes Research CenterBiological and Physical Sciences (BPS) RNA Sequencing Benchmark Training DatasetRNA sequencing data from spaceflown and control mouse liver samples, sourced from NASA GeneLab and
augmented with generative adversarial network to provide synthetic data points. The implementation uses
classification methods and hierarchical clustering to identify genes that are predictive of outcomes."In-useGANs, Hierarchical Clusteringhttps://github.com/NASA-IMPACT/bps-numericalNational Aeronautics and Space Administration
NASA-0002-2023NASAAmes Research CenterBiological and Physical Sciences Microscopy Benchmark DatasetThis study uses fluorescence microscopy images from the Biological and Physical Sciences Open Science Data
Repositories (osdr.nasa.gov). The dataset consists of 93,488 images of individual nuclei from mouse fibroblast cells
irradiated with Fe particles or X-rays and labeled for DNA double strand breaks using 53BP1 as a fluorescence
marker. DNA damage appears as small white foci in these images. The study simulates exposure to space radiation
and the dataset has been modified to be AI-ready so that AI experts can test several AI tools on them. The dataset is
publicly available on the Registry of Open Data on AWS. Implementation AI tools developed in-house are also
available on the link."In-useGraphical Neural Networkhttps://github.com/NASA-IMPACT/bps-imagery-radiation-classification/tree/cnn_classifierNational Aeronautics and Space Administration
NASA-0003-2023NASAAmes Research CenterHigh-Performance Quantum-Classical Hybrid Deep Generative Modeling Parameterized by Energy-based Models for Flight-Operations Anomaly DetectionOur project conducts high-performance scalable and explainable machine learning for flight-operations anomaly
detection, with contributions from classical computing (enhanced performance, reduced cost) and quantum
computing (encoding of quantum correlations, quantum-resource estimates). Our deep-learning model takes time
series of 19 flight metrics collected by flight recorders of commercial aircraft as input and predicts operational and
safety-relevant anomalies during the take-off and landing phases of flight."In-useConvolutional Neural Network, K-Means Clustering, Variational AutoencodersNational Aeronautics and Space Administration
NASA-0004-2023NASAAmes Research CenterPrediction of Mass Level in Radio Frequency CryogenicsUtilizing the Radio frequency signature of fluids in a tank, the ML model predicts the level of fluid in the tank. In
micro-gravity, standard fluid level detection methods do not work because the fluid is not restricted to any shape or
definition."In-useNational Aeronautics and Space Administration
NASA-0005-2023NASAAmes Research CenterPre-trained microscopy image neural network EncodersConvolutional Neural Network encoders were trained on over 100,000 microscopy images of materials. When
deployed in downstream microscopy tasks through transfer learning, encoders pre-trained on MicroNet outperform
ImageNet encoders. These pre-trained MicroNet encoders have been successfully deployed for semantic
segmentation, instance segmentation, and regression tasks. Current work is ongoing to deploy the encoders for
generative tasks and 3D texture synthesis tasks. The technology has been used to quantify the microstructure of
numerous materials including SLS core stage welds, Ni-based superalloys, composites, and oxide dispersion
strengthened alloys. Establishing the relationship between processing (how a material is made), microstructure (the
atomistic and phase arrangement of a material), and properties of materials is fundamental to the design and
development of new materials. Microstructure is often analyzed qualitatively or by tedious manual measurements.
This technology enables and improves the rapid quantification of material microstructure from microscope images
for use in data-driven approaches to design materials faster."In-useTransfer learninghttps://github.com/nasa/pretrained-microscopy-modelsNational Aeronautics and Space Administration
NASA-0006-2023NASAGlenn Research CenterApplication that provides bio-inspired solutions to engineering problems (PeTaL)PeTaL (the Periodic Table of Life) is an open source artificial intelligence (AI) design tool that leverages data and
information from nature and technology to advance biomimicry research and development. PeTaL is envisioned to
streamline various steps of the bio-inspired design process by integrating new and existing tools and methodologies
around its core ontological framework (Shyam et al. 2019; Unsworth et al. 2019). To be as comprehensive as
possible, PeTaL requires mass curation of standardized data through which it can learn, interpret, and output
predictive solutions to design queries. PeTaL is intended to be used by designers and engineers who seek nature’s
solutions to their design and engineering problems, as well as by biologists who seek to extend the application of
their scientific discoveries.
In Production: Classification of biology journal articles into functional categories.
In Development: Joint text summarization and named entity recognition task involving open-access biology journal
articles using large language models such as those available from OpenAI."In-useLLM prompt engineering, BERT text classification, Natural Language Processinghttps://github.com/nasa-petalNational Aeronautics and Space Administration
NASA-0007-2023NASAGlenn Research CenterInverse Design of MaterialsDiscovering new materials is typically a mix of art and science, with timelines to create and robustly test a new
material mix / manufacturing method ranging from ten to twenty years. This project seeks to enable rapid
discovery, optimization, qualification, and deployment of fit-for-purpose materials. Supervised ML models are
trained to establish the relationship between how a material is made and how the material performs. Then Bayesian
optimization is used to iteratively select optimal experiments to achieve the target material properties in a cost-
and time-efficient manner compared to traditional design of experiments. The project is currently being utilized in an
NESC investigation to improve SLS core stage weld quality. The technology will be used to select experiments for a
fully autonomous robotic lab that is currently being procured to design better insulating materials for electrified
aircraft. Outputs include recipes and approaches for new materials custom-tailored to applications, with a 4x
speedup for the overall materials discovery / design lifecycle, and potential 10x throughput for the same cycle based
on parallelizing discovery of multiple materials at once."In-useNational Aeronautics and Space Administration
NASA-0008-2023NASAGoddard Space Flight CenterEuropa Ice Floe Detection (GSFC Planetary Sciences Lab)Machine Learning applied to Galileo space probe imagery to detect and classify ice blocks in the chaos regions of
Jupiter's moon Europa. GANs were also used to generate simulated training data."In-useMask R-CNN, GANshttps://gitlab.grc.nasa.gov/kgansler/europa-ice-floe-detectionNational Aeronautics and Space Administration
NASA-0009-2023NASAGoddard Space Flight CenterTitan Methane Cloud Detection (GSFC Planetary Sciences Lab)Machine Learning applied to Cassini space probe imagery to detect and characterize methane clouds on Saturn's
moon Titan."In-useMask R-CNN, U-net image Recognitionhttps://gitlab.grc.nasa.gov/zyahn/titan-clouds-projectNational Aeronautics and Space Administration
NASA-0010-2023NASAJet Propulsion LaboratoryASPEN Mission PlannerBased on AI techniques, ASPEN is a modular, reconfigurable application framework which is capable of supporting a
wide variety of planning and scheduling applications. ASPEN provides a set of reusable software components that
implement the elements commonly found in complex planning/scheduling systems, including: an expressive
modeling language, a resource management system, a temporal reasoning system, and a graphical interface. ASPEN
has been used for many space missions, including the Modified Antarctic Mapping Mission, Orbital Express, Earth
Observing One, and ESA's Rosetta Orbiter."In missionconstraint-based heuristic SearchNational Aeronautics and Space Administration
NASA-0011-2023NASAJet Propulsion LaboratoryAutonomous Marine Vehicles (Single, Multiple)Due to the communication paradigm associated with operating an underwater submersible on an Ocean World, the
vehicle must be able to act autonomously when achieving scientific goals. One such goal is the study of
hydrothermal venting. Evidence for hydrothermal activity has been found on one Ocean World, Enceladus. On
Earth, these geological phenomena harbor unique ecosystems and are potentially critical to the origin of life. Similar
vents on Ocean Worlds could be the best chance at extra-terrestrial life in our Solar System. We focus on performing
autonomous science, specifically the localization of features of interest - such as hydrothermal venting - with limited
to no human interaction. A field program to Karasik Seamount in the Arctic Ocean was completed in Fall 2016 to
study and understand the human-in-the-loop approach to localizing hydrothermal venting. In 2017/2018 an
autonomous nested search method for hydrothermal venting was developed and tested in simulation using a
hydrothermal plume dispersion model developed by Woods Hole Oceanographic Institution. Numerous
deployments have been executed, including to Monterey Bay (multiple) and Chesapeake Bay."In missionconstraint-based heuristic SearchNational Aeronautics and Space Administration
NASA-0012-2023NASAJet Propulsion LaboratoryCLASP Coverage Planning & SchedulingThe Compressed Large-scale Activity Scheduling and Planning (CLASP) project is a long-range scheduler for space-
based or aerial instruments that can be modelled as pushbrooms -- 1D line sensors dragged across the surface of
the body being observed. It addresses the problem of choosing the orientation and on/off times of a pushbroom
instrument or collection of pushbroom instruments such that the schedule covers as many target points as possible
but without oversubscribing memory and energy. Orientation and time of observation is derived from geometric
computations that CLASP performs using the SPICE ephemeris toolkit. CLASP allows mission planning teams to start
with a baseline mission concept and simulate the mission's science return using models of science observations,
spacecraft operations, downlink, and spacecraft trajectory. This analysis can then be folded back into many aspects
of mission design -- including trajectory, spacecraft design, operations concept, and downlink concept. The long
planning horizons allow this analysis to span an entire mission. Actively in use for optimized scheduling for the
NISAR mission, ECOSTRESS mission (study of water needs for plant areas), EMIT mission (mineralogy of arid dusty
regions), OCO-3 (atmospheric CO2), and more, as well as for numerous mission analyses and studies (e.g.
100+)."In missionconstraint-based heuristic SearchNational Aeronautics and Space Administration
NASA-0013-2023NASAJet Propulsion LaboratoryHybrid On-Board and Ground-Based Processing of Massive Sensor Data (HyspIRI IPM)Future space missions will enable unprecedented monitoring of the Earth's environment and will generate immense
volumes of science data. Getting this data to ground communications stations, through science processing, and
delivered to end users is a tremendous challenge. On the ground, the spacecraft's orbit is projected, and automated
mission-planning tools determine which onboard-processing mode the spacecraft should use. The orbit determines
the type of terrain that the spacecraft would be overflying—land, ice, coast, or ocean, for instance. Each terrain
mask implies a set of requested modes and priorities. For example, when a spacecraft overflies polar or
mountainous regions, producing snow and ice coverage maps can provide valuable science data. The science team
can adjust these priorities on the basis of additional information (such as external knowledge of an active volcano, a
flooded area, an active wildfire, or a harmful algal bloom). The mission-planning tool accepts all these requests and
priorities, then determines which onboard-processing algorithms will be active by selecting the highest-priority
requests that fit within the onboard CPU resources, band-processing limitations, and downlink bandwidth. In the
intelligent onboard processing concept, HyspIRI's onboard processing algorithms would consist of expert-derived
decision tree classifiers, machine-learned classifiers such as SVM classifiers and regressions, classification and
regression trees (CART), Bayesian maximum-likelihood classifiers, spectral angle mappers, and direct
implementations of spectral band indices and science products."In missionconstraint-based heuristic SearchNational Aeronautics and Space Administration
NASA-0014-2023NASAJet Propulsion LaboratoryMexec Onboard Planning and ExecutionMEXEC is a lightweight, multi-mission software package for activity scheduling and execution, developed to increase the
autonomy and efficiency of a robotic explorer. MEXEC was first created as a prototype demonstration for the
Europa Clipper project as a potential solution to fail-operational requirements. Specifically, the Europa project is
concerned with the radiation environment around Jupiter, which can trigger on-board computer resets at critical
times of the mission (e.g. during Europa flybys). If a CPU reset occurs, flight software must bring the spacecraft back
to a safe state and resume science operations as quickly as possible to minimize science loss. The MEXEC prototype
flight software was developed to provide such a capability using proven AI planning, scheduling, and execution
technologies. Instead of command sequences, MEXEC works with task networks, which include abstract
representations of command behavior, constraints on timing, and resources required and/or consumed by the
behavior. Using this knowledge on-board, MEXEC can monitor command behavior and react to off-nominal
outcomes (e.g. a CPU reset), reconstructing command sequences to continue spacecraft operations without
jeopardizing spacecraft safety."In-useNational Aeronautics and Space Administration
NASA-0015-2023NASAJet Propulsion LaboratoryOnboard Planner for Mars2020 Rover (Perseverance)The M2020 onboard scheduler incrementally constructs a feasible schedule by iterating through activities in priority-
first order. When considering each activity, it computes the valid time intervals for placement, taking into account
preheating, maintenance heating, and wake/sleep of the rover as required. After an activity is placed (other than a
preheat/maintenance or wake/sleep activity), the activity is never reconsidered by the scheduler for deletion or moving.
Therefore the scheduler can be considered non-backtracking, and it only searches in the sense that it computes valid
timeline intervals for legal activity placement. Meta Search: Because the onboard scheduler will be invoked many
times in a given sol (Martian day) with a range of possible contexts (due to execution variations), its non-backtracking
nature leaves it vulnerable to brittleness. In order to mitigate this potential brittleness, the Copilot
system performs a Monte Carlo based stochastic analysis to set meta-parameters of the scheduler - primarily activity
priority, but also potentially preferred time and temporal constraints. Also: research, experiments, and
engineering to empower future rovers with onboard autonomy; planning, scheduling & execution; path planning;
onboard science; image processing; terrain classification; fault diagnosis; and location estimation. This is a multi-faceted
effort and includes experimentation and demonstrations on-site at JPL's simulated Mars navigation yard."In missionNational Aeronautics and Space Administration
NASA-0016-2023NASAJet Propulsion LaboratorySensorWeb: Volcano, Flood, Wildfire, and Others.The Sensor Web Project uses a network of sensors linked by software and the internet to an autonomous satellite
observation response capability. This system of systems is designed with a flexible, modular architecture to
facilitate expansion in sensors, customization of trigger conditions, and customization of responses. This system has
been used to implement a global surveillance program to study volcanoes. We have also run sensorweb tests to
study flooding, cryosphere events, and atmospheric phenomena. Specifically, in our application, we use low-resolution,
high-coverage sensors to trigger observations by high-resolution instruments. Note that there are many
other rationales to network sensors into a sensorweb. For example, automated response might enable observation
using complementary instruments such as imaging radar, infra-red, visible, etc. Or automated response might be
used to apply more assets to increase the frequency of observation to improve the temporal resolution of available
data. Our sensorweb project is being used to monitor the Earth's 50 most active volcanoes. We have also run
sensorweb experiments to monitor flooding, wildfires, and cryospheric events (snowfall and melt, lake freezing and
thawing, sea ice formation and breakup)."In missionconstraint-based heuristic SearchNational Aeronautics and Space Administration
NASA-0017-2023NASAJet Propulsion LaboratoryTRN (Terrain Relative Navigation)Terrain Relative Navigation (TRN) estimates position during Mars landing by automatically matching landmarks
identified in descent images to a map generated from orbital imagery. The position estimate is used to select a
safe and reachable landing site in a region with many large hazards. TRN was used successfully by the Mars 2020
mission during its landing on February 18, 2021, and will be used on the Mars Sample Return Lander."In-useComputer vision and state Estimation.National Aeronautics and Space Administration
NASA-0018-2023NASALangley Research CenterAutonomous WAiting Room Evaluation (AWARE)Uses an existing security camera and a YOLO machine learning model to detect and count the number of people waiting
for service at Langley's Badge & Pass Office. When a predetermined threshold of people is exceeded, automated
texts and emails are sent to request additional help at the service counters."In-useConvolutional Neural NetworkNational Aeronautics and Space Administration
NASA-0019-2023NASALangley Research CenterGeophysical Observations Toolkit for Evaluating Coral Health (GOTECH)Three capstone projects conducted in 2021-2022 with Georgia Tech and the University of Rochester to develop machine
learning models that can analyze satellite LIDAR imagery to detect coral reefs and monitor their health. Capstones
were conducted with the support of Coral Vita (an NGO) and the National Institute of Aerospace. Results were
presented at United Nations COP27."In-usesupport vector machine, artificial neural networkhttps://ntrs.nasa.gov/citations/20220010955National Aeronautics and Space Administration
NASA-0020-2023NASALangley Research CenterLessons Learned Bot (LLB)In near real-time, the Lessons Learned Bot, or LLB, brings lessons learned (LL) documents to users through a
Microsoft Excel add-in application locally installed to search for LL content relevant to the text within the selected
Excel cell. The application encompasses a corpus of documents, a trained Machine Learning (ML) model, built-in
ML tools to train users' documents, and an easy-to-use user interface to allow for the streamlined discovery of LL
content. Today, NASA's LL are online and searchable via keywords. Nevertheless, users often face a challenge finding
lessons relevant to their issues. Applying advancements in Natural Language Processing (NLP) ML algorithms, the
LLB can find and rank LL records relevant to the text in the user's selected Excel cells, containing just a few words or
entire paragraphs of text. Results are displayed to the user in their existing Excel workflow. The LLB's installation
package comes with a pre-trained NASA LL dataset and a NASA Scientific and Technical Information (STI) dataset, as
well as on-demand training tools allowing the user to apply the LLB search algorithm to their own discipline-specific
datasets. Additionally, we have an API version of this software that can be called from any application within the
Agency firewall."In-useNational Aeronautics and Space Administration
NASA-0021-2023NASALangley Research CenterPedestrian Safety Corridors for Drone Test RangeNASA Langley Research Center (LaRC) is actively experimenting with Unmanned Aerial Systems (UAS - Drones and
surrounding systems), including command, control, coordination, and safety mechanisms. LaRC is expanding an on-
site UAS test range to include areas where people walk, drive, etc. This project leverages the parking advisor image
recognition project and applies it to detecting pedestrian traffic, supplementing statistical assessment of human-
heavy and human-light traffic areas with near-real-time human-presence detection. Inputs include camera signals
and hand-labelled training data. Outputs include maps indicating the density of human pedestrian traffic. The results
have been embedded into the GRASP flight risk simulation tool."In-usehttps://gitlab.grc.nasa.gov/dmtrent/wahldo-1National Aeronautics and Space Administration
NASA-0022-2023NASAMarshall Space Flight CenterAirplane detectionDeep learning-based airplane detection from high-resolution satellite imageryIn-useNational Aeronautics and Space Administration
NASA-0023-2023NASAMarshall Space Flight CenterAutomatic Detection of Impervious Surfaces from Remotely Sensed Data Using Deep LearningUses a U-Net based architecture with VGG-19 as the encoder block and a custom decoder block to map impervious
surfaces using Landsat and OSM data patches"In-useNational Aeronautics and Space Administration
NASA-0024-2023NASAMarshall Space Flight CenterDeep Learning Approaches for mapping surface water using Sentinel-1Uses a U-Net based architecture to map surface water using the Sentinel-1 SAR ImagesIn-useNational Aeronautics and Space Administration
NASA-0025-2023NASAMarshall Space Flight CenterDeep Learning-based Hurricane Intensity EstimatorA web-based situational awareness tool that uses deep learning on satellite images to objectively estimate
windspeed of a hurricane"In-useNational Aeronautics and Space Administration
NASA-0026-2023NASAMarshall Space Flight CenterForecasting Algal Blooms With Ai In Lake AtitlánDeep analyses on image datasets from different satellites. Machine learning will help to identify the variables that
could predict future algal blooms. Knowledge of what those triggers are can be turned into precise preventative action,
not just in Lake Atitlán, but also in other freshwater bodies with similar conditions in Central and South America."In-useNational Aeronautics and Space Administration
NASA-0027-2023NASAMarshall Space Flight CenterGCMD Keyword Recommender (GKR)Natural Language Processing-based science keyword suggestion toolIn-useNatural Language ProcessingNational Aeronautics and Space Administration
NASA-0028-2023NASAMarshall Space Flight CenterImageLabelerWeb-based Collaborative Machine Learning Training Data Generation ToolIn-useNational Aeronautics and Space Administration
NASA-0029-2023NASAMarshall Space Flight CenterMapping sugarcane in Thailand using transfer learning, a lightweight convolutional neural network, NICFI high resolution satellite imagery and Google Earth EngineUses a U-Net based architecture with a MobileNetV2 based encoder and transfer learning from a global model to map
the sugarcane pixels in Thailand. This uses the NICFI mosaic for training purposes."In-useNational Aeronautics and Space Administration
NASA-0030-2023NASAMarshall Space Flight CenterPredicting streamflow with deep learningUses a long short-term memory model to predict streamflow at USGS gauge sites with inputs from the NASA Land
Information System and forecasts of precipitation"In-useNational Aeronautics and Space Administration
NASA-0031-2023NASAMarshall Space Flight CenterShip detectionDeep learning-based ship detection from high-resolution satellite imageryIn-useNational Aeronautics and Space Administration
NASA-0032-2023NASAMarshall Space Flight CenterSimilarity Search for Earth Science Image ArchiveSelf Supervised Based Learning approach to search image archives using a query imageIn-useNational Aeronautics and Space Administration
OPM-0000-2023OPMHRS/FSC/ASMG & OCIO/FITBSHuman Resource Apprentice (HRA)Evaluate the technical feasibility, validity, and affordability of providing AI-supported applicant review help to HR Specialists in USA Staffing. OPM will also evaluate prototype against fairness and bias standards to ensure it does not introduce adverse impact to the hiring process. The key metric that OPM is seeking is “can the AI solution deliver faster, more accurate evaluations of applicant qualifications when compared to experienced HR Specialists?”Development and AcquisitionU.S. Office of Personnel Management
OPM-0001-2023OPMHRS/USAJOBSSkills matching on Open OpportunitiesThe website uses Skills Engine, a third-party vendor product, to provide personalized recommendations to users based on user input text and opportunity descriptionsOperation and MaintenanceNatural Language ProcessingU.S. Office of Personnel Management
OPM-0002-2023OPMHRS/USAJOBSSimilar Job RecommendationsUSAJOBS is planning to use natural language processing to provide better matches between posted job opportunities in order to help users identify opportunities of interest.Development and AcquisitionNatural Language ProcessingU.S. Office of Personnel Management
OPM-0003-2023OPMRS/RORetirement Services (RS) Chat BotA chatbot is a computer program that uses artificial intelligence (AI) and natural language processing to understand customer questions and automate responses to them, simulating human conversation. Retirement Services uses the chatbot to answer user questions related to Survivor Benefits. The bot initially started with a set of 13 questions and continues to grow based on reviews of user interaction.Operation and MaintenanceNatural Language ProcessingU.S. Office of Personnel Management
SSA-0000-2023SSAOffice of Analytics, Review, and OversightModernized Development Worksheet (MDW)This process uses AI to review textual data that is part of claim development tasks so it can be categorized into workload topics using natural language processing to facilitate faster technician review.Social Security Administration
SSA-0001-2023SSAOffice of Analytics, Review, and OversightAnomalous iClaim Predictive ModelThe anomalous iClaim predictive model is a machine learning model that identifies high-risk iClaims. These claims are then sent to Operations for further review before additional action is taken to adjudicate the claims.Social Security Administration
SSA-0002-2023SSAOffice of Analytics, Review, and OversightPre-Effectuation Review / Targeted Denial Review ModelsThese review models use machine learning to identify cases with greatest likelihood of disability eligibility determination error and refer them for quality review checks.Social Security Administration
SSA-0003-2023SSAOffice of Analytics, Review, and OversightRep Payee Misuse ModelThis model uses machine learning to estimate the probability of resource misuse by representative payees and flag the cases for a technician to examine.Social Security Administration
SSA-0004-2023SSAOffice of Analytics, Review, and OversightCDR ModelThis model uses machine learning techniques to identify disability cases with the greatest likelihood of medical improvement and flag them for a continuing disability review.Social Security Administration
SSA-0005-2023SSAOffice of Analytics, Review, and OversightSSI Redetermination ModelThis model uses machine learning to identify supplemental security income cases with highest expected overpayments due to changes in financial eligibility and flag them for technician review.Social Security Administration
SSA-0006-2023SSAOffice of Analytics, Review, and OversightMedicare Part D Subsidy ModelThis model uses machine learning to identify cases most likely to have incorrect Medicare Part D subsidies and flag them for technician review.Social Security Administration
SSA-0007-2023SSAOffice of Analytics, Review, and OversightPATH ModelThis model uses machine learning to identify cases likely to receive an allowance at the hearing level and refer them to administrative law judges or senior adjudicators for prioritized review.Social Security Administration
SSA-0008-2023SSAOffice of Analytics, Review, and Oversight; Office of Hearing Operations, Office of Disability SystemsInsightInsight is decision support software used by hearings and appeals-level Disability Program adjudicators to help maximize the quality, speed, and consistency of their decision making. Insight analyzes the free text of disability decisions and other case data to offer adjudicators real-time alerts on potential quality issues and case-specific reference information within a web application. It also offers adjudicators a series of interactive tools to help streamline their work. Adjudicators can leverage these features to speed their work and fix issues before the case moves forward (e.g. to another reviewing employee or to the claimant). Insight's features are powered by several natural language processing and artificial intelligence packages and techniques.Social Security Administration
SSA-0009-2023SSAOffice of Disability Determinations, Office of Disability Information SystemsIntelligent Medical Language Analysis Generation (IMAGEN)IMAGEN is an IT Modernization Disability Analytics & Disability Decision Support (ADDS) Product that will provide new tools and services to visualize, search and more easily identify relevant clinical content in medical records. These tools and services will improve the efficiency and consistency of disability determinations and decisions and provide a foundation for machine-based decisional guidance. IMAGEN will transform text to data and enable disability adjudicators to leverage various machine learning technologies like Natural Language Processing (NLP) and predictive analytics and will support other high-priority agency initiatives such as fraud prevention and detection.Social Security Administration
SSA-0010-2023SSAOffice of Disability Information Systems, Office of Hearing Operations, Office of Appellate OperationsDuplicate Identification Process (DIP)The Duplicate Identification Process's (DIP's) objective is to help the user identify, flag, and mark duplicates more efficiently, reducing the amount of time spent reviewing cases for hearings. DIP uses artificial intelligence software in the form of image recognition technology to accurately identify duplicates consistent with SSA policy.Social Security Administration
SSA-0011-2023SSAOffice of Disability Information Systems, Office of Hearing Operations, Office of Appellate OperationsHandwriting recognition from formsAI performs OCR against handwritten entries on specific standard forms submitted by clients. This use case supports a Robotic Process Automation effort as well as standalone use.Social Security Administration
SSA-0012-2023SSAOffice of Retirement and Disability ProgramsQuick Disability Determinations ProcessThe Quick Disability Determinations (QDD) process uses a computer-based predictive model to screen initial applications to identify cases where a favorable disability determination is highly likely and medical evidence is readily available. The Agency bases the QDD model's predictive scores on historical data from application forms completed by millions of applicants. By identifying QDD cases early in the process, the Social Security Administration can prioritize this workload and expedite case processing. The Agency routinely refines the QDD model to reflect the characteristics of the recent applicant population and optimize its ability to identify strong candidates for expedited processing.Social Security Administration
SSA-0013-2023SSAOffice of SystemsMobile Wage Reporting (MOBWR)Mobile Wage Reporting uses AI to extract text/data from scanned images/documents representing pay stubs or payroll information to enable faster processing.Social Security Administration
TREAS-0000-2023TREASAccount Management ChatbotThe Accounts Management Chatbot leverages a natural language
understanding model within the eGain intent engine. This NLU maps
utterances to specific intents and returns the appropriate knowledge
article."Operation and MaintenanceDepartment of Treasury
TREAS-0001-2023TREASAppeals Case MemorandumThe Appeals Case Memorandum (ACM) leverages natural language
processing capabilities to assist with extraction, consolidation, and labeling
of unstructured text from IRS ACM documents, automatic identification of
key information, and processing of results into a structured format. The
outcome of this process is for IRS staff to review appeals information for
insights, which can be used upstream to enhance case quality, consistency,
and performance. The summary of results involves detailed analysis of text
relationships, issues, and citation narrative text paragraphs to provide
insight into issues commonly adjusted during the appeals process."ImplementationDepartment of Treasury
TREAS-0002-2023TREASCoin quality inspection systemAutomated coin visual inspection tools to search for defects on production
lines. Currently, each coining press operator manually inspects coins for
quality. The goal is to improve quality and eliminate waste; feasibility and
tools are being researched."InitiationDepartment of Treasury
TREAS-0003-2023TREASCollection Chat BotThe Natural Language Understanding (NLU) model will be located inside
the eGain intent engine. This NLU will take customer-typed text input, aka
utterances. It will map the utterance to a specific intent and return the
appropriate knowledge article."In production: less than six monthsDepartment of Treasury
TREAS-0004-2023TREASCollection Voice BotThe Nuance Natural Language Understanding (NLU) model will be located
inside the Automated Collections IVR (ACI) main menu. This NLU will take
customer speech input, aka utterances. It will map the utterance to a
specific intent and direct the taxpayer down to a certain call path."In production: less than six monthsDepartment of Treasury
TREAS-0005-2023TREASCX AnalyticsIRS' Customer Experience (CX) Analytics is a capability that uses multiple,
customer service-related data sources to identify
issues/anomalies/improvement opportunities across the customer service
channel modes."ImplementationDepartment of Treasury
TREAS-0006-2023TREASDATA ActThe Digital Accountability and Transparency Act (DATA) Act Bot automates
verifying that IRS Federal Procurement Data System (FPDS) reporting
matches the information in contract documents (e.g. dollar amounts,
dates, location of work). Natural language processing is used to extract
unstructured information from contract documents. F1 scores are used to
measure performance of validation models for each specific data element."Planned (not in production)Department of Treasury
TREAS-0007-2023TREASInventory Item Replenishment MLR Modeling Pilot - Phase 1bThe Bureau of Engraving and Printing wanted to establish a proof of
concept (POC) for Predictive Analytics at the BEP. This POC consisted of
developing a Logistic Regression model for the Office of Supply Chain
Management (OSCM) to predict whether an item would be delivered by
the specified "Need by Date". This is the date that the BEP needs the
material in its facility and is set automatically to 128 days when a purchase
order (PO) is approved in the system. The model utilizes historical
requisition, vendor, and item-specific data to come up with binary (0 or 1)
predictions, which are then used to determine whether an item will be
delivered on time or if the OSCM should expect a delay. If the model
outputs a 1, we expect that the item will be delayed and the OSCM can be
proactive in their decision making to prepare for a potential inventory
shortage."Development and AcquisitionDepartment of Treasury
TREAS-0008-2023TREASInventory Item Replenishment MLR Modeling POC - Phase 1aThe Bureau of Engraving and Printing wanted to establish a proof of
concept (POC) for Predictive Analytics at the BEP. This POC consisted of
developing a Multiple Linear Regression (MLR) model to predict Processing
Lead Times for the Office of Supply Chain Management (OSCM).
Processing Lead Times are the number of days it takes an item to be
delivered to the target facility from the time the purchase order (PO) was
approved. The model utilizes historical requisition, vendor, and
item-specific data to come up with numerical predictions, which are then
used to determine whether an item will be delivered on time or if the
OSCM should expect a delay. If a delay is expected, the OSCM can be
proactive in their decision making to prepare for a potential inventory
shortage."Development and AcquisitionDepartment of Treasury
TREAS-0009-2023TREASInventory Item Replenishment MLR Modeling POC - Phase 2The Bureau of Engraving and Printing wanted to operationalize a model
using their newly deployed Cloudera Data Science Workbench (CDSW)
application to predict whether an item would be delivered by the vendor
Promised Date. This is the date the vendor promises an item to be
delivered to BEP. The model utilizes historical requisition, vendor, and
item-specific data to come up with binary (0 or 1) predictions, which are
then used to determine whether an item will be delivered on time or if the
OSCM should expect a delay. If the model outputs a 1, we expect that the
item will be delayed and the OSCM can be proactive in their decision
making to prepare for a potential inventory shortage."Development and AcquisitionDepartment of Treasury
TREAS-0010-2023TREASNRP RedesignDeploy state-of-the-art AI machine learning methods to provide a lower
opportunity cost method of estimating a compliance baseline to support
tax gap estimation, improper payments reporting, development and
validation of workload identification and selection models, and to inform
policy analysis. System inputs require existing NRP data, which provide an
acceptable level of precision and quality for an acceptable level of data
quality output."In production: less than one yearDepartment of Treasury
TREAS-0011-2023TREASPredictive equipment maintenance systemPredictive maintenance to increase equipment uptime, improve safety,
and lower maintenance costs. Researching feasibility and tools."InitiationDepartment of Treasury
TREAS-0012-2023TREASTAS Virtual AssistantThe TAS Virtual Assistant Chatbot will capture utterances from
taxpayers/end-users to direct them to helpful resources on IRS and TAS
public websites."InitiationDepartment of Treasury
TREAS-0013-2023TREASTaxpayer Accessibility - Machine Translation (MT)Taxpayer Accessibility Machine Translation (MT) is a SaaS-based
Commercial Off-the-Shelf (COTS) product that uses Amazon Translate, a
neural machine translation (NMT) service. The MT solution implements
customization features in the product which will have capabilities to
integrate existing Linguistics Policies Tools and Services (LPTS) translations
and workflows through a centralized repository formed by a collection of
existing and customized IRS glossaries to return translations from English
to Spanish (and Spanish to English) that more accurately reflect
native-tongue verbiage."ImplementationDepartment of Treasury
USAID-0000-2023USAIDBureau for Development, Democracy, and Innovation (DDI)Media Early Warning System (MEWS)To detect narratives and trends in social media alterations of images and video in order to find and counteract malign narrativesInitiationU.S. Agency for International Development
USAID-0001-2023USAIDBureau for Development, Democracy, and Innovation (DDI)Gender differentiated credit scoringUniversity of California, Berkeley, is building a machine learning model to conduct gender differentiated credit scoring for customers of Rappicard in Mexico. They will compare this ML model to Rappi's "status quo" model to determine whether a gender differentiated model leads to greater access to credit for women.InitiationXGBoost algorithm with parameters tuned via random hyperparameter search using 5-fold cross validation on the training dataset for 60 iterations (resulting in at least a 95% chance of finding a hyperparameter combination in the best 5% of combinations). The scores resulting from the XGBoost are calibrated via Platt scaling so that model scores can be interpreted as default probabilities. This is a standard method for training credit scoring algorithms in the industry.U.S. Agency for International Development
USAID-0002-2023USAIDBureau for Development, Democracy, and Innovation (DDI)Machine Learning for PeaceObjective 1 under the Illuminating New Solutions and Programmatic Innovations for Resilient Spaces (INSPIRES) program. Includes program activities
and website - https://web.sas.upenn.edu/mlp-devlab/"Development and AcquisitionU.S. Agency for International Development
USAID-0003-2023USAIDBureau for Development, Democracy, and Innovation (DDI)Long-term impacts of land-use/land-cover dynamics on surface water quality in Botswana’s reservoirs using satellite data and artificial intelligence methods: Case study of the Botswana’s Limpopo River Basin (1984-2019)For water supply, semi-arid Botswana relies on the reservoirs within Botswana’s LRB. Reservoirs are particularly susceptible to the negative impacts of land-use and land-cover (LULC) activities and runoff because of their complex dynamics, relatively longer water residence times, and their role as an integrating sink for pollutants from their drainage basins. Despite these interrelationships and significance in regional and global economic stability, land and water (L-W) are often treated in “silos”. To understand the complex L-W nexus within the LRB, this study will use data-driven artificial intelligence for quantitative determination of the relationships between LULC change, together with socioeconomic development indicators and climate change, and their impacts on water quality and availability within the basin, both for 1984-2019 and to predict future scenarios (2020-2050). To advance data acquisition for LULC analysis and climate change, the study utilizes optical Earth-observation and meteorological satellite data. To provide a near real-time and cost-effective approach for continuous monitoring of reservoir water quality within the basin, the study will develop empirical models for water quality estimation and water quality index mapping using 35 years of in-situ water quality measurements and water spectral observations using a drone-borne spectrometer and optical satellite imagery through regression modeling and geospatial methods.Development and AcquisitionU.S. Agency for International Development
USAID-0004-2023USAIDBureau for Development, Democracy, and Innovation (DDI)Morogoro youth empowerment through establishment of social innovation (YEESI) lab for problem-centered training in machine visionThe project proposes to establish a social innovation lab for a machine vision program that will be used by youth in the Morogoro region of Tanzania. There are young people in the area who have studied information technologies and allied sciences, and while most of them can write computer programs, they cannot solve machine vision problems. This project aims to increase awareness among the youth of Morogoro and nearby regions to address machine vision problems in agriculture. Machine vision is a new and understudied practice in Tanzania; hence, this project will contribute to efforts in the creation of scientific societies that address the most pressing problems faced by more than 80% of Tanzania’s population who engage in farming. The main agricultural problems can be classified into five categories, as explained below: (1) Disease Detection and Classification: The project will develop experts who will solve problems in disease identification using machine vision for most of the diseases in crops and livestock, which are misdiagnosed by farmers. (2) Weed Classification: The project will develop algorithms that accurately identify weeds and contribute to the growing scientific database for automatic weed detection. (3) Pest Detection and Classification: Appropriate tools using machine vision for Integrated Pest Management (IPM) are needed in Tanzania, as IPM has been hindered due to a lack of extension officers to train farmers on mitigation and identification of pests in agriculture. (4) Crop Seedlings Stand Count and Yield Estimation: Use of machine vision and drones instead of scouting manually to estimate stand counts would provide appropriate mitigation strategies for replanting that would be beneficial to commercial farmers. 
Also of importance are algorithms to sort and estimate yield by counting the fruits and to estimate the amount of other agricultural products. (5) Crop Vigor Estimation: Most farmers apply inputs evenly across the farm because they cannot predetermine crop vigor. Accurate estimation of crop health would help farmers to mitigate the problems earlier and improve crop performance and avoid failure. Algorithms to determine crop vigor developed in this project will contribute to the improvement of the methods to estimate crop performance earlier.Development and AcquisitionU.S. Agency for International Development
USAID-0005-2023USAIDBureau for Development, Democracy, and Innovation (DDI)Project VikelaUse AI to detect illegal rhino horn in airplane luggage X-Ray scannersOperation and MaintenanceMachine LearningU.S. Agency for International Development
USAID-0006-2023USAIDBureau for Global Health (GH)Using ML for predicting treatment interruption among PLHIV in NigeriaUsing data from the USAID-funded Strengthening Integrated Delivery of HIV/AIDS Services (SIDHAS) project in Nigeria, we trained and tested an algorithm that can be used for predicting the probability that someone newly initiated on ART will interrupt treatment. The algorithm has been successfully integrated into the Lafiya Management Information System (LAMIS), the client-level electronic medical record system. Each week, the outputs for each new patient are shared with staff at the health facilities, and those at high risk are provided with more intensive follow-up support to reduce the risk of treatment interruption. We also conducted a qualitative assessment among health care workers at the facilities to determine their perception of ML and what additional support is required for institutionalizing ML into their routine work.Development and AcquisitionU.S. Agency for International Development
USAID-0007-2023USAIDBureau for Global Health (GH)Breakthrough RESEARCH’s Social Media ListeningSocial media listening draws on machine learning to synthesize and organize the vast quantities of data shared over social media platforms. Breakthrough RESEARCH carried out social listening on 12,301 social media posts in Nigeria to explore how gender-related online conversations manifest themselves and whether they have changed in the last five years. Using Crimson Hexagon’s machine learning algorithm, “Brightview,” publicly available social media content originating in the countries of interest was scraped by the algorithm, for posts relevant to RH/FP and youth. The resulting social media posts were then classified by topic, using language detected in the content. This provided a dataset categorizing conversations into overarching topics, allowing analyses to uncover key trends in topic specific conversation volume, insights about misinformation, attitudes and social norms, and more. The machine learning algorithm was able to identify relevant social media content. The 12,301 social media posts were qualitatively assessed and categorized, allowing researchers to monitor and track social media conversations far more expansively than allowed by research methods more traditionally used in public health and SBC programs.Operation and MaintenanceU.S. Agency for International Development
USAID-0008-2023USAIDBureau for Global Health (GH)Serbia: AI predictions for the utilization of hospital bedsAI technology was used to predict bed occupancy at hospitals with MoH data from 2019, with an overall median error by department of around 20%. This was a proof-of-concept model developed at the request of the Institute of Public Health (IPH) Batut to understand how AI can work and the value it adds. CHISU was asked to subsequently focus on a different use case (waiting list optimization for scheduled imaging diagnostics services, specifically CT and MRI), which is considered higher priority to demonstrate the implementation of the national AI strategy and the effect of AI in data use for decision making by the government, and will be addressed in 2023-24.ImplementationU.S. Agency for International Development
USAID-0009-2023USAIDBureau for Global Health (GH)Mali: AI predictions for the optimization of the allocation of the distribution of COVID-19 vaccinesAI technology was used to develop a pandemic preparedness AI model to support allocation of COVID-19 vaccines based on a multi-tiered strategy for target populations: 1) hotspots for COVID-19 positive cases and 2) pregnant/breastfeeding women using DHIS2 data. This was a proof-of-concept model.ImplementationU.S. Agency for International Development
USAID-0010-2023USAIDBureau for Global Health (GH)Indonesia: AI predictions for improving forecasts for TB drugsAI technology will be used to develop a forecasting AI model for TB sensitive drugs to inform more accurate annual quantification exercises for the MoH linked to their national data integration platform SatuSehatInitiationU.S. Agency for International Development
USAID-0011-2023USAIDBureau for Latin America and the CaribbeanNASA SERVIR - Bias Correcting Historical GEOGloWS ECMWF Streamflow Service (GESS) data using Machine Learning (ML) TechniquesGEOGloWS ECMWF Streamflow Service (GESS) helps to organize the international community engaged in the hydrologic sciences, observations, and their application to forecasting and provides a forum for government-to-government collaboration, and engagement with the academic and private sectors to achieve the delivery of actionable water information. Since the formal creation of the initiative in 2017, the most significant element of GEOGloWS has been the application of Earth Observations (EO) to create a system that forecasts flow on every river of the world while also providing a 40-year simulated historical flow.
This application uses a Long Short-Term Memory (LSTM) model with time series of discharge data to bias-correct the globally available GESS discharge information locally."Development and AcquisitionU.S. Agency for International Development
USAID-0012-2023USAIDBureau for Latin America and the CaribbeanNASA SERVIR - Using artificial intelligence to forecast harmful algae blooms in Lake Atitlán, GuatemalaThis application uses machine learning with Earth observations and weather-modeled data to forecast daily algal blooms in Lake Atitlán, Guatemala. The forecasting system is being used by Lake Authorities, such as the Authority for Sustainable Management of the Lake Atitlan Basin and its surroundings (AMSCLAE), to inform their Harmful Algal Blooms Alert System. This work is also supported by National Geographic and Microsoft through their Artificial Intelligence (AI) for Innovation grants.ImplementationU.S. Agency for International Development
USAID-0013-2023USAIDBureau for Latin America and the CaribbeanNASA SERVIR - Mapping urban vulnerability using AI techniquesThis activity will improve urban vulnerability assessment in key population centers, particularly by co-creating replicable methods to use satellite imagery to map informal settlements.InitiationU.S. Agency for International Development
USDA-0000-2023USDAUSDAAPHISPredictive modeling of invasive pest species and category at the port of entry using machine learning algorithmsMachine learning algorithms are developed with inspection data to improve the ability to detect invasive/quarantine-significant pests at the port of entry.Operation and ManagementMachine LearningDepartment of Agriculture
USDA-0001-2023USDAUSDAAPHISDetection of pre-symptomatic HLB infected citrusIdentify pixels with HLB infection signature in multispectral and thermal imageryOperation and ManagementVisual AnalysisDepartment of Agriculture
USDA-0002-2023USDAUSDAAPHISHigh throughput phenotyping in citrus orchardsLocate, count, and categorize citrus trees in an orchard to monitor orchard healthOperation and ManagementMachine LearningDepartment of Agriculture
USDA-0003-2023USDAUSDAAPHISDetection of aquatic weedsIdentify and locate aquatic weedsOperation and ManagementMachine LearningDepartment of Agriculture
USDA-0004-2023USDAUSDAAPHISAutomated Detection & Mapping of Host Plants from Ground Level ImageryGenerate maps of target trees from ground-level (streetview) imageryDevelopment and AcquisitionMachine LearningDepartment of Agriculture
USDA-0005-2023USDAUSDAAPHISStandardization of cut flower business names for message set dataNatural language processing technique. Data are cleaned (e.g., remove punctuation) to facilitate matching. Cosine similarity is calculated, similar terms are matched, and the results are output.ImplementationNatural Language ProcessingDepartment of Agriculture
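The cleaning-then-cosine-similarity pipeline this entry describes can be sketched with TF-IDF over character n-grams. The business names, threshold, and vectorizer settings below are illustrative assumptions, not the APHIS implementation.

```python
# Sketch of name standardization by cosine similarity: clean the text,
# vectorize, compute pairwise similarity, and pair similar names.
# Names, n-gram settings, and the 0.6 threshold are illustrative.
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

names = ["Rosa Flora Ltd.", "ROSA FLORA LIMITED", "Sunshine Blooms Inc"]

def clean(s: str) -> str:
    # Remove punctuation and normalize case/whitespace before matching.
    return re.sub(r"[^\w\s]", "", s).lower().strip()

vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 3))
tfidf = vec.fit_transform([clean(n) for n in names])
sim = cosine_similarity(tfidf)

# Output pairs whose similarity clears the (illustrative) threshold.
matches = [(names[i], names[j])
           for i in range(len(names)) for j in range(i + 1, len(names))
           if sim[i, j] >= 0.6]
```

Character n-grams (rather than whole words) keep near-duplicate spellings such as "Ltd." vs. "LIMITED" close in the vector space.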
USDA-0006-2023USDAUSDAAPHISApproximate string or fuzzy matching, used to automate matching similar, but not identical, text in administrative documentsThe algorithm computes a string similarity metric which can be used to classify similar strings into a single category, reducing information duplication and onerous, manual error-checkingOperation and ManagementFuzzy matchingDepartment of Agriculture
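A minimal version of this approximate-matching idea fits in the Python standard library: compute a string-similarity metric and fold near-identical strings into one category. The records and the 0.85 cutoff below are invented for illustration.

```python
# Sketch of fuzzy matching to deduplicate administrative text.
# Records and the 0.85 similarity cutoff are illustrative assumptions.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    # Normalized edit-overlap ratio in [0, 1]; 1.0 means identical.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Group near-identical entries under one canonical form.
records = ["Acme Farm Supply", "ACME Farm Supplies", "Beta Seeds LLC"]
canonical = {}
for r in records:
    match = next((c for c in canonical if similarity(r, c) >= 0.85), None)
    canonical.setdefault(match or r, []).append(r)
```

Each incoming string is compared against the canonical forms seen so far; a clear match joins that group, otherwise it starts a new one, which replaces manual error-checking of variant spellings.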
USDA-0007-2023USDAUSDAAPHISTraining machine learning models to automatically read file attachments and save information into a more convenient Excel format.Artificial intelligence used to automate document processing and information extraction. Program managers often need information from specific form fields that are sent as PDF email attachments. Many emailed documents are received each day, making manually opening each attachment and copying the needed information too time-consuming.Operation and ManagementMachine LearningDepartment of Agriculture
USDA-0008-2023USDAUSDAAPHISArtificial Intelligence for correlative statistical analysisAI-type statistical techniques are used to model predictive relationships between variables. We routinely use modeling approaches such as random forest, artificial neural networks, k-nearest neighbor clustering, and support vector machines, for statistical prediction.Operation and ManagementNeural networks,ClusteringDepartment of Agriculture
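The four model families this entry names can be compared side by side with cross-validation. The dataset is synthetic and the hyperparameters are defaults; this is a sketch of the workflow, not the agency's actual analysis.

```python
# Sketch of the correlative-modeling toolbox named in the entry:
# random forest, neural network, k-NN, and SVM on synthetic data.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

X, y = make_regression(n_samples=300, n_features=10, noise=5.0, random_state=0)

models = {
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "neural network": MLPRegressor(max_iter=500, random_state=0),
    "k-nearest neighbors": KNeighborsRegressor(),
    "support vector machine": SVR(),
}
# Mean 5-fold cross-validated R^2 for each model family.
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}
```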
USDA-0009-2023USDAUSDAARS4% Repair DashboardThe model reviews the descriptions of expenses tagged to repairs and maintenance and classifies expenses as "repair" or "not repair" based on keywords in context.Operation and ManagementNatural Language ProcessingDepartment of Agriculture
USDA-0010-2023USDAUSDAARSARS Project MappingNLP of research project plans including term analysis and clustering enables national program leaders to work with an interactive dashboard to find synergies and patterns within and across the various ARS research program portfolios.Operation and ManagementNatural Language ProcessingDepartment of Agriculture
USDA-0011-2023USDAUSDAARSNAL Automated indexingCogito (vendor) software uses AI for automated subject indexing to annotate peer reviewed journal articles (~500,000 annually) using the National Ag Library Thesaurus concept space (NALT). Only NALT concepts are annotated as metadata to content in the Library's bibliographic citation database, AGRICOLA, PubAg, and Ag Data Commons.Operation and ManagementNatural Language ProcessingDepartment of Agriculture
USDA-0012-2023USDAUSDAERSDemocratizing DataThe purpose of this project is to use AI tools, machine learning and natural language processing to understand how publicly-funded data and evidence are used to serve science and society.ImplementationNatural Language ProcessingDepartment of Agriculture
USDA-0013-2023USDAUSDAERSWestatA competition to find automated, yet effective, ways of linking USDA nutrition information to 750K food items in a proprietary data set of food purchases and acquisitions. Competing teams used a number of AI methods including Natural Language Processing (NLP), random forest, and semantic matching.Operation and ManagementNatural Language Processing,Machine Learning,OtherDepartment of Agriculture
USDA-0014-2023USDAUSDAFNSRetailer Receipt AnalysisThe Retailer Receipt Analysis is a Proof of Concept (POC) that uses Optical Character Recognition (OCR), an application of artificial intelligence, on a sample (no more than 1,000) of FNS receipt and invoice data. Consultants will use this data to demonstrate how the existing manual process can be automated, saving staff time, ensuring accurate review, and detecting difficult patterns. This POC will pave the way for a review system that (1) has an automated workflow and learns from analyst feedback and (2) can incorporate known SNAP fraud patterns, look for new patterns, and visualize alerts on these patterns on retailer invoices and receipts.Development and AcquisitionMachine LearningDepartment of Agriculture
USDA-0015-2023USDAUSDAFNSNutrition Education & Local Access DashboardThe goal of this Dashboard is to provide a county-level visualization of FNS nutrition support, specifically nutrition education and local food access, alongside other metrics related to hunger and nutritional health. As part of this dashboard, the team developed a K-means clustering script to group States by 7 different clustering options: Farm to School Intensity & Size, Program Activity Intensity, Ethnicity & Race, Fresh Food Access, School Size, and Program Participation. This allows users to find like-minded, or similar, States based on any of these characteristics, opening up avenues for partnerships with States that they otherwise may not have considered.Operation and ManagementMachine Learning,K-Means Clustering,Visual AnalysisDepartment of Agriculture
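The state-grouping step this entry describes can be sketched with a standard K-means call. The feature matrix, the number of clusters, and the column meanings below are assumptions for illustration, not the dashboard's actual data.

```python
# Sketch of grouping states by K-means on illustrative feature columns.
# Features, scaling choice, and k=5 are assumptions, not the real script.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Rows = 50 states; columns = e.g. program participation, food access metrics.
features = rng.random((50, 4))

# Standardize so no single metric dominates the distance calculation.
scaled = StandardScaler().fit_transform(features)
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(scaled)
# States sharing a label are "similar" and candidates for partnership.
```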
USDA-0016-2023USDAUSDAFPACLand Change Analysis Tool (LCAT)We employ a random forest machine learning classifier to produce high resolution land cover maps from aerial and/or satellite imagery. Training data is generated from a custom-built web application. We built and operate a 192-node docker cluster to parallelize CPU-intensive processing tasks. We are publishing results through a publicly available Image service. To date we have mapped over 600 million acres and have generated over 700 thousand training samples.Operation and ManagementMachine Learninghttps://cran.r-project.org/web/packages/randomForest/randomForest.pdf
https://cran.r-project.org/web/packages/clhs/clhs.pdf
Department of Agriculture
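A per-pixel land-cover workflow like LCAT's can be sketched as: train a random forest on labeled band values, then classify every pixel of an image. The imagery, band count, and class names below are synthetic placeholders, not LCAT's data.

```python
# Sketch of per-pixel land-cover classification with a random forest,
# assuming stacked spectral bands as features; imagery here is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
h, w, n_bands = 64, 64, 6
image = rng.random((h, w, n_bands))        # stand-in for aerial/satellite bands

# Training samples: (band values, land-cover label) pairs, e.g. collected
# through a labeling web application as the entry describes.
X_train = rng.random((500, n_bands))
y_train = rng.integers(0, 4, size=500)     # e.g. water/forest/crop/urban

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Classify every pixel, then restore the map shape.
land_cover = clf.predict(image.reshape(-1, n_bands)).reshape(h, w)
```

Flattening the raster to a (pixels × bands) matrix and reshaping the predictions back is what makes this kind of classifier easy to parallelize across image tiles.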
USDA-0017-2023USDAUSDAFederal CDO CouncilOCIO/CDO Council Comment Analysis ToolThe Comment Analysis pilot has shown that a toolset leveraging recent advances in Natural Language Processing (NLP) can aid the regulatory comment analysis process. We developed tools that help comment reviewers identify the topics and themes of comments, as well as group comments that are semantically similar. Tools like these offer significant value by creating efficiencies through novel insights and streamlined processing of comments, reducing duplicative, upfront development efforts across government, and ultimately realizing cost savings for agencies and the USG.
,Development and Acquisition,Natural Language Processing,https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fkenambrose-GSA%2FCDO-Council-Public-Comment-Analysis-Project.git&data=04%7C01%7C%7Cae1c2e505f5f4cb50f0e08d9453f6ede%7Ced5b36e701ee4ebc867ee03cfa0d4697%7C0%7C0%7C637616961007063,Department of Agriculture
USDA-0018-2023USDAUSDAForest ServiceEcosystem Management Decision Support System (EMDS)EMDS is a spatial decision support system for landscape analysis and planning that runs as a component of ArcGIS and QGIS. Users develop applications for their specific problem that may use any combination of four AI engines for 1) logic processing, 2) multi-criteria decision analysis, 3) Bayesian networks, and 4) Prolog-based decision trees.Operation and ManagementMachine LearningDepartment of Agriculture
USDA-0019-2023USDAUSDAForest ServiceWildland Urban Interface - Mapping Wildfire LossThis is a proof-of-concept study to investigate the use of machine learning (deep learning / convolutional neural networks) and object-based image classification techniques to identify buildings, building loss, and defensible space around buildings before and after a wildfire event in wildland-urban interface settings.Development and AcquisitionMachine Learninghttps://www.sciencedirect.com/science/article/pii/S221242092100501XDepartment of Agriculture
USDA-0020-2023USDAUSDAForest ServiceCLT Knowledge DatabaseThe CLT knowledge database catalogs cross-laminated timber information in an interface that helps users find relevant information. The information system uses data aggregator bots that search the internet for relevant information. These bots search for hundreds of keywords and use machine learning to determine if what is found is relevant. The search engine uses intelligent software to locate and update pertinent CLT references, as well as categorize information with respect to common application and interest areas. As of 2/24/2022, the CLT knowledge database has cataloged >3,600 publications on various aspects of CLT. This system fosters growth of mass timber markets by disseminating knowledge and facilitating collaboration among stakeholders, and by reducing the risk of duplication of efforts. Manufacturers, researchers, design professionals, code officials, government agencies, and other stakeholders directly benefit from the tool, thereby supporting the increasing use of mass timber, which benefits forest health by increasing the economic value of forests.Operation and ManagementMachine LearningDepartment of Agriculture
USDA-0021-2023USDAUSDAForest ServiceRMRS Raster UtilityRMRS Raster Utility is a .NET object oriented library that simplifies data acquisition, raster sampling, and statistical and spatial modeling while reducing the processing time and storage space associated with raster analysis. It includes machine learning techniques.Operation and ManagementMachine Learninghttps://collab.firelab.org/software/projects/rmrsrasterDepartment of Agriculture
USDA-0022-2023USDAUSDAForest ServiceTreeMap 2016TreeMap 2016 provides a tree-level model of the forests of the conterminous United States. It matches forest plot data from Forest Inventory and Analysis (FIA) to a 30x30 meter (m) grid. TreeMap 2016 is being used in both the private and public sectors for projects including fuel treatment planning, snag hazard mapping, and estimation of terrestrial carbon resources. A random forests machine-learning algorithm was used to impute the forest plot data to a set of target rasters provided by Landscape Fire and Resource Management Planning Tools (LANDFIRE: https://landfire.gov). Predictor variables consisted of percent forest cover, height, and vegetation type, as well as topography (slope, elevation, and aspect), location (latitude and longitude), biophysical variables (photosynthetically active radiation, precipitation, maximum temperature, minimum temperature, relative humidity, and vapour pressure deficit), and disturbance history (time since disturbance and disturbance type) for the landscape circa 2016.Operation and ManagementMachine LearningDepartment of Agriculture
USDA-0023-2023USDAUSDAForest ServiceLandscape Change Monitoring System (LCMS)The Landscape Change Monitoring System (LCMS) is a national Landsat/Sentinel remote-sensing-based dataset produced by the USDA Forest Service for mapping and monitoring changes related to vegetation canopy cover, as well as land cover and land use. The process utilizes temporal change classifications together with training data in a supervised classification process for vegetation gain and loss, as well as land cover and use.Development and AcquisitionMachine Learning,Visual AnalysisDepartment of Agriculture
USDA-0024-2023USDAUSDAForest ServiceGeospatial and Remote Sensing Training CoursesSeveral courses are offered which teach the use of software and scripting which allow for machine learning. The courses change, but current topics include Intro and Advanced Change Detection, eCognition (software package), Geospatial Scripting for Google Earth Engine. Some of the courses show how to use Collect Earth Online.Operation and ManagementMachine LearningDepartment of Agriculture
USDA-0025-2023USDAUSDAForest ServiceForest Health Detection MonitoringMachine learning models are used to (1) upscale training data, using Sentinel-2, Landsat, MODIS, and lidar imagery, that was collected from both the field and high-resolution imagery to map and monitor stages of forest mortality and defoliation across the United States, and (2) to post-process raster outputs to vector polygons.Operation and ManagementMachine LearningDepartment of Agriculture
USDA-0026-2023USDAUSDANASSCropland Data LayerA machine learning algorithm is used to interpret readings from satellite-based sensors and classify the type of crop or activity that falls in each 30 square meter pixel (a box of fixed size) on the ground. The algorithms are trained on USDA's Farm Services Agency data and other sources of data as "ground truth". It allows us to not only produce a classification, but to assess the accuracy of the classification as well. For commodities, like corn and soybeans, the CDL is highly accurate. The CDL has been produced for national coverage since 2008. Some summary and background about the CDL is available in a number of peer-reviewed research papers and presentations:
https://www.nass.usda.gov/Research_and_Science/Cropland/othercitations/index.php"Operation and ManagementMachine LearningDepartment of Agriculture
USDA-0027-2023USDAUSDANASSList Frame Deadwood IdentificationThe deadwood model leverages boosted regression trees with inputs such as administrative linkage data, frame data, and historical response information to produce a propensity score representing a relative likelihood of a farm operation being out of business. Common tree splits were identified using the model and combined with expert knowledge to develop a recurring process for deadwood clean up.Operation and ManagementMachine LearningDepartment of Agriculture
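The propensity-scoring step can be sketched with gradient-boosted trees: fit on historical features, score each record, and surface the highest-propensity records for review. The synthetic features, the label rule, and the top-25 cutoff are illustrative assumptions.

```python
# Sketch of a "deadwood" propensity score via gradient-boosted trees.
# Features, label rule, and the top-25 review cutoff are illustrative.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.random((500, 5))   # stand-ins for linkage, frame, response history
y = (X[:, 0] + rng.normal(0, 0.2, 500) > 0.7).astype(int)  # 1 = out of business

model = GradientBoostingClassifier(random_state=0).fit(X, y)
propensity = model.predict_proba(X)[:, 1]   # relative likelihood of deadwood
flagged = np.argsort(propensity)[::-1][:25]  # top-25 records for clean-up review
```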
USDA-0028-2023USDAUSDANASSCensus of Agriculture Response Propensity ScoresThe response propensity scores to the COA are derived from random forest models that use historical data, control data, and other survey data. These scores are used to help target more effective data collection.Operation and ManagementMachine LearningDepartment of Agriculture
USDA-0029-2023USDAUSDANIFAClimate Change Classification NLPThe model classifies NIFA funded projects as climate change related or not climate related through natural language processing techniques. The model input features include text fields containing the project's title, non-technical summary, objectives and keywords. The target is a dummy variable classification of projects as climate change related or not climate change related.Development and AcquisitionNatural Language ProcessingDepartment of Agriculture
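A binary text classifier over project text fields can be sketched as a TF-IDF plus logistic regression pipeline. The example projects, labels, and model choice are invented for illustration; the entry does not specify NIFA's actual model.

```python
# Sketch of the described task: classify project text as climate change
# related (1) or not (0). Texts, labels, and the model are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# In the real system the input would combine title, non-technical summary,
# objectives, and keywords; these tiny examples are invented.
texts = [
    "drought resilient maize under climate variability",
    "greenhouse gas mitigation in dairy systems",
    "post-harvest storage of apples",
    "food safety testing methods",
]
labels = [1, 1, 0, 0]   # dummy target: 1 = climate change related

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
pred = clf.predict(["carbon sequestration in rangeland soils"])
```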
USDA-0030-2023 | USDA | USDA | NRCS | Operational water supply forecasting for western US rivers | Western US water management is underpinned by forecasts of spring-summer river flow volumes made using operational hydrologic models. The USDA Natural Resources Conservation Service (NRCS) National Water and Climate Center (NWCC) operates the largest such forecast system regionally, carrying on a nearly century-old tradition. The NWCC recently developed a next-generation prototype for generating such operational water supply forecasts (WSFs), the multi-model machine-learning metasystem (M4), which integrates a variety of AI and other data-science technologies chosen or developed to satisfy specific user needs. Required inputs are snow and precipitation data from the NRCS Snow Survey and Water Supply Forecast program's SNOTEL environmental monitoring network, but inputs are flexible. In hindcasting test cases spanning diverse environments across the western US and Alaska, out-of-sample accuracy improved markedly over current benchmarks. Various technical design elements, including multi-model ensemble modeling, automated machine learning (AutoML), hyperparameter pre-calibration, and theory-guided data science, collectively permitted automated training and operation. Live operational testing at a subset of sites additionally demonstrated the logistical feasibility of workflows, as well as the geophysical explainability of results in terms of known hydroclimatic processes, belying the black-box reputation of machine learning and enabling relatable forecast storylines for NRCS customers. | Development and Acquisition | Machine Learning | | Department of Agriculture
USDA-0031-2023 | USDA | USDA | NRCS | Ecological Site Descriptions (machine learning) | Analysis of over 20 million records of soils data and 20,000 text documents of ecological state-and-transition information. | Development and Acquisition | Machine Learning | | Department of Agriculture
USDA-0032-2023 | USDA | USDA | NRCS | Conservation Effects Assessment Project | The goal is to predict conservation benefits at the field level. The model uses farmer survey data, APEX modeling results, and environmental data. | Development and Acquisition | Machine Learning | | Department of Agriculture
USDA-0033-2023 | USDA | USDA | NRCS | Digital Imagery (no-change) for NRI program | Using neural networks and other AI technologies to detect no-change areas in digital imagery for the National Resources Inventory (NRI) program. | Initiation | Neural Networks | | Department of Agriculture
USDA-0034-2023 | USDA | USDA | OASCR | Artificial Intelligence SPAM Mitigation Project | The AI solution involves a Robotic Process Automation plus AI/ML model solution to automatically classify and remove spam and marketing emails that appear in civil rights complaint email channels. A significant portion of incoming OASCR emails are spam, marketing, and phishing emails. | Development and Acquisition | Machine Learning | | Department of Agriculture
USDA-0035-2023 | USDA | USDA | OCIO | Acquisition Approval Request Compliance Tool | A natural language processing (NLP) model was developed to use the text in procurement header and line descriptions within USDA's Integrated Acquisition System (IAS) to determine the likelihood that an award is IT-related and therefore might require an AAR. The model uses the text characteristics of awards that have an AAR number entered into IAS, then calculates the probability of being IT-related for procurements that did not have an AAR number entered in IAS. | Operation and Management | Natural Language Processing | | Department of Agriculture
USDA-0036-2023 | USDA | USDA | OCIO | Intelligent Ticket Routing | Routes BMC Remedy tickets to the proper work group automatically, using Python, JupyterHub, scikit-learn, GitLab, Flask, Gunicorn, NGINX, and ERMS. | Operation and Management | Machine Learning | | Department of Agriculture
USDA-0037-2023 | USDA | USDA | OCIO | Predictive Maintenance Impacts | Predicts impacts of DISC maintenance on infrastructure items. Utilizes Einblick, MySQL, Python, Linux, and Tableau. | Operation and Management | Machine Learning | | Department of Agriculture
USDA-0038-2023 | USDA | USDA | OSSP | Video Surveillance System | The Video Surveillance System (VSS) design will include a video management system (VMS), NVRs, DVRs, encoders, fixed cameras, pan-and-tilt cameras, network switches, routers, IP cables, equipment racks, and mounting hardware. The VSS shall control multiple sources of video surveillance subsystems to collect, manage, and present video clearly and concisely. The VMS shall integrate the capabilities of each subsystem across single or multiple sites, allowing video management of any compatible analog or digital video device through a unified configuration platform and viewer. Disparate video systems are normalized and funneled through a shared video experience. Operators can drag and drop cameras from the Security Management System hardware tree into VMS views and leverage Security Management System alarm integration and advanced features that help track a target through a set of sequential cameras, with a simplified method to select a new central camera and surrounding camera views. | Operation and Management | Visual Analysis | | Department of Agriculture
VA-0000-2023 | VA | | | Artificial Intelligence physical therapy app | This app is a physical therapy support tool. It is a data-source-agnostic tool that takes input from a variety of wearable sensors and then analyzes the data to give feedback to the physical therapist in an explainable format. | | | | Department of Veterans Affairs
VA-0001-2023 | VA | | | Artificial intelligence coach in cardiac surgery | The artificial intelligence coach in cardiac surgery infers misalignment in team members' mental models during complex healthcare task execution. Of interest are safety-critical domains (e.g., aviation, healthcare), where lack of shared mental models can lead to preventable errors and harm. Identifying model misalignment provides a building block for enabling computer-assisted interventions to improve teamwork and augment human cognition in the operating room. | | | | Department of Veterans Affairs
VA-0002-2023 | VA | | | AI Cure | AICURE is a phone app that monitors adherence to orally prescribed medications during clinical or pharmaceutical-sponsor drug studies. | | | | Department of Veterans Affairs
VA-0003-2023 | VA | | | Acute kidney injury (AKI) | This project, a collaboration with Google DeepMind, focuses on detecting acute kidney injury (AKI), ranging from minor loss of kidney function to complete kidney failure. The artificial intelligence can also detect AKI that may be the result of another illness. | | | | Department of Veterans Affairs
VA-0004-2023 | VA | | | Assessing lung function in health and disease | Health professionals can use this artificial intelligence to determine predictors of normal and abnormal lung function and sleep parameters. | | | | Department of Veterans Affairs
VA-0005-2023 | VA | | | Automated eye movement analysis and diagnostic prediction of neurological disease | Artificial intelligence recursively analyzes previously collected data both to improve the quality and accuracy of automated algorithms and to screen for markers of neurological disease (e.g., traumatic brain injury, Parkinson's, stroke). | | | | Department of Veterans Affairs
VA-0006-2023 | VA | | | Automatic speech transcription engines to aid scoring neuropsychological tests | Automated speech transcription engines analyze the cognitive decline of older VA patients. Digitally recorded speech responses are transcribed using multiple artificial-intelligence-based speech-to-text engines. The transcriptions are fused together to reduce or obviate the need for manual transcription of patient speech in order to score the neuropsychological tests. | | | | Department of Veterans Affairs
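The fusion step in VA-0006-2023 — combining several engines' transcripts into one — can be sketched with a position-wise majority vote. Real systems (e.g., ROVER) first align the hypotheses; the toy transcripts here are assumed to be word-aligned already, and the `fuse` helper is hypothetical.

```python
# Illustrative sketch: fuse multiple speech-to-text outputs by majority vote.
from collections import Counter
from itertools import zip_longest

def fuse(transcripts):
    """Majority vote at each word position across engine outputs."""
    fused = []
    for words in zip_longest(*[t.split() for t in transcripts], fillvalue=""):
        winner, _ = Counter(words).most_common(1)[0]
        if winner:  # skip positions where the vote picks the padding value
            fused.append(winner)
    return " ".join(fused)

engines = [
    "the patient repeated the word list",  # engine A
    "the patient repeated a word list",    # engine B
    "the patient repeated the word list",  # engine C
]
print(fuse(engines))  # → "the patient repeated the word list"
```

Voting across engines is what lets the fused transcript reduce the need for manual correction before test scoring.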
VA-0007-2023 | VA | | | CuraPatient | CuraPatient is a remote tool that allows patients to better manage their conditions without having to see a provider. Driven by artificial intelligence, it allows patients to create a profile to track their health, enroll in programs, manage insurance, and schedule appointments. | | | | Department of Veterans Affairs
VA-0008-2023 | VA | | | Digital command center | The Digital Command Center seeks to consolidate all data in a medical center and apply predictive and prescriptive analytics to allow leaders to better optimize hospital performance. | | | | Department of Veterans Affairs
VA-0009-2023 | VA | | | Disentangling dementia patterns using artificial intelligence on brain imaging and electrophysiological data | This collaborative effort focuses on developing a deep learning framework to predict the various patterns of dementia seen on MRI and EEG and to explore the use of these imaging modalities as biomarkers for various dementias and epilepsy disorders. The VA is performing retrospective chart review to achieve this. | | | | Department of Veterans Affairs
VA-0010-2023 | VA | | | Machine learning (ML) for enhanced diagnostic error detection and ML classification of protein electrophoresis text | Researchers are performing chart review to collect true/false-positive annotations and construct a vector embedding of patient records, followed by similarity-based retrieval of unlabeled records "near" the labeled ones (a semi-supervised approach). The aim is to use machine learning as a filter, after rules-based retrieval, to improve specificity. Embedding inputs will be selected high-value structured data pertinent to stroke risk and possibly selected prior text notes. | | | | Department of Veterans Affairs
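The semi-supervised retrieval step in VA-0010-2023 — embed records, then pull unlabeled records "near" the labeled ones — can be sketched as a nearest-neighbor query. The 3-dimensional vectors are invented stand-ins for the structured stroke-risk features the entry mentions.

```python
# Illustrative sketch: retrieve unlabeled records nearest to labeled ones.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(2)
labeled = rng.normal(size=(10, 3))     # records with true/false-positive labels
unlabeled = rng.normal(size=(200, 3))  # candidate records to review next

nn = NearestNeighbors(n_neighbors=5).fit(unlabeled)
_, idx = nn.kneighbors(labeled)        # 5 nearest unlabeled per labeled record

# Unique unlabeled records "near" labeled ones, queued for annotation.
candidates = sorted(set(idx.ravel().tolist()))
print(len(candidates))
```

Reviewing only these neighbors, rather than every rules-based hit, is how such a filter can improve specificity.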
VA-0011-2023 | VA | | | Behavidence | Behavidence is a mental health tracking app. Veterans download the app onto their phone, and it compares their phone usage to a digital phenotype that represents people with a confirmed diagnosis of mental health conditions. | | | | Department of Veterans Affairs
VA-0012-2023 | VA | | | Machine learning tools to predict outcomes of hospitalized VA patients | This is an IRB-approved study that aims to examine machine learning approaches to predict health outcomes of VA patients. It will focus on the prediction of Alzheimer's disease, rehospitalization, and Clostridioides difficile infection. | | | | Department of Veterans Affairs
VA-0013-2023 | VA | | | Nediser reports QA | Nediser is a continuously trained artificial intelligence "radiology resident" that assists radiologists in confirming the X-ray properties in their radiology reports. Nediser can select normal templates, detect hardware, evaluate patella alignment and leg length and angle discrepancy, and measure Cobb angles. | | | | Department of Veterans Affairs
VA-0014-2023 | VA | | | Precision medicine PTSD and suicidality diagnostic and predictive tool | This model interprets various real-time inputs in a diagnostic and predictive capacity in order to forewarn episodes of PTSD and suicidality, support early and accurate diagnosis of the same, and gain a better understanding of the short- and long-term effects of stress, especially in extreme situations, as it relates to the onset of PTSD. | | | | Department of Veterans Affairs
VA-0015-2023 | VA | | | Prediction of Veterans' Suicidal Ideation following Transition from Military Service | Machine learning is used to identify predictors of veterans' suicidal ideation. The relevant data come from a web-based survey of veterans' experiences within three months of separation and every six months thereafter for the first three years after leaving military service. | | | | Department of Veterans Affairs
VA-0016-2023 | VA | | | PredictMod | PredictMod uses artificial intelligence to determine whether predictions can be made about diabetes based on the gut microbiome. | | | | Department of Veterans Affairs
VA-0017-2023 | VA | | | Predictor profiles of OUD and overdose | Machine learning prediction models evaluate the interactions of known and novel risk factors for opioid use disorder (OUD) and overdose in post-9/11 veterans. Several machine learning classification-tree modeling approaches are used to develop predictor profiles of OUD and overdose. | | | | Department of Veterans Affairs
VA-0018-2023 | VA | | | Provider directory data accuracy and system of record alignment | AI is used to add value as a transactor for intelligent identity resolution and linking. AI also has a domain cache function that can be used both for clinical decision support and for intelligent state reconstruction over time and real-time discrepancy detection. As a synchronizer, AI can perform intelligent propagation and semi-automated discrepancy resolution. AI adapters can be used for inference via OWL and logic programming. Lastly, AI has long-term storage (a "black box flight recorder") for virtually limitless machine learning and BI applications. | | | | Department of Veterans Affairs
VA-0019-2023 | VA | | | Seizure detection from EEG and video | Machine learning algorithms use EEG and video data from a VHA epilepsy monitoring unit to automatically identify seizures without human intervention. | | | | Department of Veterans Affairs
VA-0020-2023 | VA | | | SoKat Suicidal Ideation Detection Engine | The SoKat Suicide Ideation Engine (SSIE) uses natural language processing (NLP) to improve identification of veteran suicidal ideation (SI) from survey data collected by the Office of Mental Health (OMH) Veterans Crisis Line (VCL) support team (VSignals). | | | | Department of Veterans Affairs
VA-0021-2023 | VA | | | Using machine learning to predict perfusionists' critical decision-making during cardiac surgery | A machine learning approach is used to build predictive models of perfusionists' decision-making during critical situations that occur in the cardiopulmonary bypass phase of cardiac surgery. Results may inform future development of computerized clinical decision support tools to be embedded in the operating room, improving patient safety and surgical outcomes. | | | | Department of Veterans Affairs
VA-0022-2023 | VA | | | Gait signatures in patients with peripheral artery disease | Machine learning is used to improve treatment of functional problems in patients with peripheral artery disease (PAD). Previously collected biomechanics data is used to 1) determine the gait signatures of patients with PAD and 2) assess the ability of limb acceleration measurements to identify and model the meaningful biomechanics measures from PAD data. | | | | Department of Veterans Affairs
VA-0023-2023 | VA | | | Medication Safety (MedSafe) Clinical Decision Support (CDS) | Using VA electronic clinical data, the Medication Safety (MedSafe) Clinical Decision Support (CDS) system analyzes current clinical management for diabetes, hypertension, and chronic kidney disease, and makes patient-specific, evidence-based recommendations to primary care providers. The system uses knowledge bases that encode clinical practice guideline recommendations and an automated execution engine to examine multiple comorbidities, laboratory test results, medications, and history of adverse drug events in evaluating patient clinical status and generating patient-specific recommendations. | | | | Department of Veterans Affairs
VA-0024-2023 | VA | | | Prediction of health outcomes, including suicide death, opioid overdose, and decompensated outcomes of chronic diseases | Using electronic health records (EHR) (both structured and unstructured data) as inputs, this tool outputs deep phenotypes and predictions of health outcomes, including suicide death, opioid overdose, and decompensated outcomes of chronic diseases. | | | | Department of Veterans Affairs
VA-0025-2023 | VA | | | VA-DoE Suicide Exemplar Project | The VA-DoE Suicide Exemplar project is currently utilizing artificial intelligence to improve VA's ability to identify veterans at risk for suicide through three closely related projects, all involving collaborations with the Department of Energy. | | | | Department of Veterans Affairs
VA-0026-2023 | VA | | | Machine learning models to predict disease progression among veterans with hepatitis C virus | A machine learning model is used to predict disease progression among veterans with hepatitis C virus. | | | | Department of Veterans Affairs
VA-0027-2023 | VA | | | Prediction of biologic response to thiopurines | Using CPRS and CDW data, artificial intelligence is used to predict biologic response to thiopurines among veterans with inflammatory bowel disease. | | | | Department of Veterans Affairs
VA-0028-2023 | VA | | | Predicting hospitalization and corticosteroid use as a surrogate for IBD flares | This work examines data from 20,368 Veterans Health Administration (VHA) patients with an inflammatory bowel disease (IBD) diagnosis between 2002 and 2009. Longitudinal labs and associated predictors were used in random forest models to predict hospitalizations and steroid usage as a surrogate for IBD flares. | | | | Department of Veterans Affairs
VA-0029-2023 | VA | | | Predicting corticosteroid-free endoscopic remission with vedolizumab in ulcerative colitis | This work uses random forest modeling on a cohort of 594 patients treated with vedolizumab to predict the outcome of corticosteroid-free biologic remission at week 52 on the testing cohort. Models were constructed using baseline data or data through week 6 of VDZ therapy. | | | | Department of Veterans Affairs
VA-0030-2023 | VA | | | Use of machine learning to predict surgery in Crohn's disease | Machine learning analyzes patient demographics, medication use, and longitudinal laboratory values collected between 2001 and 2015 from adult patients in the Veterans Integrated Service Network (VISN) 10 cohort. The data was used for analysis in prediction of Crohn's disease and to model future surgical outcomes within one year. | | | | Department of Veterans Affairs
VA-0031-2023 | VA | | | Reinforcement learning evaluation of treatment policies for patients with hepatitis C virus | A machine learning model is used to predict disease progression among veterans with hepatitis C virus. | | | | Department of Veterans Affairs
VA-0032-2023 | VA | | | Predicting hepatocellular carcinoma in patients with hepatitis C | This prognostic study used data on patients with hepatitis C virus (HCV)-related cirrhosis in the national Veterans Health Administration who had at least 3 years of follow-up after the diagnosis of cirrhosis. The data was used to examine whether deep learning recurrent neural network (RNN) models that use raw longitudinal data extracted directly from electronic health records outperform conventional regression models in predicting the risk of developing hepatocellular carcinoma (HCC). | | | | Department of Veterans Affairs
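The modeling idea in VA-0032-2023 — an RNN consuming raw longitudinal records to emit a risk score — can be sketched as follows. This is only the forward pass of a toy network with random (untrained) weights; the feature names, dimensions, and scoring are invented, not the study's model.

```python
# Illustrative sketch: a simple tanh RNN steps through a patient's visit
# history and outputs a sigmoid risk score (weights random, data synthetic).
import numpy as np

rng = np.random.default_rng(3)
n_features, hidden = 4, 8  # e.g. 4 labs per visit (hypothetical choices)

W_x = rng.normal(scale=0.1, size=(hidden, n_features))
W_h = rng.normal(scale=0.1, size=(hidden, hidden))
w_out = rng.normal(scale=0.1, size=hidden)

def risk_score(visits):
    """Run the RNN over a (time, features) array; sigmoid of the last state."""
    h = np.zeros(hidden)
    for x_t in visits:             # one recurrence step per visit, oldest first
        h = np.tanh(W_x @ x_t + W_h @ h)
    return 1 / (1 + np.exp(-(w_out @ h)))

patient = rng.normal(size=(12, n_features))  # 12 visits of 4 labs
print(f"predicted HCC risk: {risk_score(patient):.3f}")
```

Because the recurrence carries state across visits, the model can use the trajectory of the labs rather than a single snapshot, which is the advantage over conventional regression that the study set out to test.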
VA-0033-2023 | VA | | | Computer-aided detection and classification of colorectal polyps | This study is investigating the use of artificial intelligence models for improving clinical management of colorectal polyps. The models receive video frames from colonoscopy video streams and analyze them in real time in order to (1) detect whether a polyp is in the frame and (2) predict the polyp's malignant potential. | | | | Department of Veterans Affairs
VA-0034-2023 | VA | | | GI Genius (Medtronic) | The Medtronic GI Genius aids in the detection of colon polyps through artificial intelligence. | | | | Department of Veterans Affairs
VA-0035-2023 | VA | | | Extraction of family medical history from patient records | This pilot project uses TIU documentation on African American veterans aged 45-50 to extract family medical history data and identify veterans who are at risk of prostate cancer but have not undergone prostate cancer screening. | | | | Department of Veterans Affairs
VA-0036-2023 | VA | | | VA/IRB-approved research study for finding colon polyps | This IRB-approved research study uses a randomized trial for finding colon polyps with artificial intelligence. | | | | Department of Veterans Affairs
VA-0037-2023 | VA | | | Interpretation/triage of eye images | Artificial intelligence supports triage of eye patients cared for through telehealth, interprets eye images, and assesses health risks based on retina photos. The goal is to improve diagnosis of a variety of conditions, including glaucoma, macular degeneration, and diabetic retinopathy. | | | | Department of Veterans Affairs
VA-0038-2023 | VA | | | Screening for esophageal adenocarcinoma | National VHA administrative data is used to adapt tools that use electronic health records to predict the risk of esophageal adenocarcinoma. | | | | Department of Veterans Affairs
VA-0039-2023 | VA | | | Social determinants of health extractor | AI is used with clinical notes to identify social determinants of health (SDOH) information. The extracted SDOH variables can be used during associated health-related analysis to determine, among other factors, whether SDOH can be a contributor to disease risks or healthcare inequality. | | | | Department of Veterans Affairs