Connectionists: NASA Big Data/Machine Learning Early Career Faculty Call for Proposals

Oza, Nikunj C. (ARC-TI) nikunj.c.oza at nasa.gov
Wed Feb 15 22:06:45 EST 2017


NASA Funding Opportunity for Early Career Faculty in Big Data, Machine Learning, Artificial Intelligence for NASA Data Challenges

Please see http://tinyurl.com/NASA-17ECF for a listing of the four topics, including "Big Data and Artificial Intelligence Solutions for NASA Data Challenges" (provided below for reference), as well as general program information, eligibility, proposal requirements, and other relevant information.

For questions, please e-mail hq-ecf-call at mail.nasa.gov.

Due dates:
NOIs (strongly encouraged): March 3, 2017 (5PM Eastern)
Proposals: March 31, 2017 (5PM Eastern, 2PM Pacific)

-----

Topic 4 - Big Data and Artificial Intelligence Solutions for NASA Data Challenges

The objective of this topic is to develop and apply advanced data science methodologies, such as predictive analytics and machine learning, to aid domain experts in discovering new insights for NASA science and exploration---called computer-aided insight generation.

NASA invests significant resources in collecting and storing two primary types of large, heterogeneous datasets from space exploration and science missions. The first type of data, referred to as science data, are data that directly relate to the science objective of the mission, such as Earth science data collected by satellite instruments. Other science data, as distinct from direct observations, could be predictions of physical phenomena from modeling and simulations. The second type of data, referred to as operations data, are data that represent the operational state and health of the spacecraft systems and instruments that support the collection of the science data.

NASA’s science data are designed to provide insight into physical processes and advance scientific disciplines. While traditional analysis approaches, such as those focused on physics-based modeling, have yielded transformative insights, science data may be under-utilized, particularly as the scale, complexity, and multi-disciplinary nature of the data and phenomena grow. NASA has substantial operations data from science missions that are often critical to performing the science reconstruction and generating science data products, but that are even more valuable when they are transformed into useful insights and knowledge that can inform improvements for current and future missions. Manual and multi-disciplinary analyses of these data are becoming increasingly impractical due to the quintessential "Big Data" problem of rapidly growing data volume, variety (in data types, data rates, and other characteristics), and velocity (rate at which data are produced). Veracity of these data---the question of how accurate the data are---is also an important challenge to overcome for any confident insight generation.

Modern "Big Data" problems are not unique to NASA. Other organizations, including other government agencies and U.S. industry, also have datasets for which manual methods are impractical. Common challenges include:

* Understanding the full data lifecycle
* Capturing and curating data from repositories that may not be well-architected to enable easy access to distributed, heterogeneous data
* Developing novel statistical approaches (or other approaches including physics-based modeling) for data analysis and other mechanisms for identifying and extracting interesting features and patterns
* Developing methodologies for validating results, comparing predictions to measurements, and visualizing massive datasets and results

Organizations are increasingly utilizing commercial as well as free or open-source computer software that implements machine learning and data mining algorithms to aid in analyzing large, complex datasets. Many organizations are extending these existing tools for their own needs. Universities are performing research and development to create new algorithms and methods in machine learning and data mining, among others. However, most developments have not been applied to NASA science and operations data, nor have they been developed with issues unique to the NASA community in mind.

A variety of machine learning and other artificial intelligence technologies such as case-based reasoning and goal-oriented planning can learn from science or operations data and can generate predictions or classifications efficiently, such as for understanding long-term equipment health trends. NASA problems sometimes require combinations of science and operations data to explore trade-offs for current and future mission planning as well as to generate new insights for advancing the science discipline. Additionally, for NASA applications, it is valuable to have technologies that can be utilized by domain experts to discover useful insights. This requires data-driven methods that exhibit transparency, the ability to accept feedback, and utilization of existing domain knowledge in the form of physics-based models.

This solicitation topic specifically seeks innovative university research to develop computer-aided insight generation tools that can be applied to science and operational data of NASA science and exploration. Potential research focuses include, but are not limited to, computer-aided tools that:

* Produce new insights (as defined below) from NASA science and operations data, or combinations of NASA science and operations data
* Fuse physics-based and other traditional scientific modeling approaches with advanced data science methodologies such as predictive analytics, artificial intelligence, and machine learning approaches

New developments of computer-aided tools to generate insights from NASA science and operations data should consider the following features, at a minimum:

* Ability to collect and curate datasets (such as providing provenance metadata) through interaction with existing data repositories
* Ability to formulate and test hypotheses regarding data quality and anomalies from the combination of science and operations data
* Ability to scale analyses to large and heterogeneous datasets
* Ability to reveal, in a human-interpretable form, how decisions/insights are derived
* Ability to accept user feedback on the results (in the form of corrections and features that constitute the user's rationales for the corrections)
* Ability to provide validation, such as comparing models vs. measurements

Proposers selected for award are expected to demonstrate their tools on NASA science and operations data during the course of the award. To enable this, NASA domain experts will facilitate access to data and models necessary to pursue the research. Proposers selected for award are also expected to compare the performance of any novel methods developed with relevant existing machine learning and data mining tools, as appropriate. Proposers are encouraged to leverage open-source tools and engage in open-source communities of practice.

Please refer to Section 7---Points of Contact for Further Information of this Appendix if you have technical questions pertaining to this topic.

