<div dir="ltr">DEEM'17<br>The 1st Workshop on Data Management for End-to-End Machine Learning, May 14, 2017.<br><a href="http://deem-workshop.org" target="_blank">http://deem-workshop.org</a><br><a href="https://twitter.com/deem_workshop" target="_blank">https://twitter.com/deem_works<wbr>hop</a><br><br>Held in conjunction with ACM SIGMOD 2017<br>Raleigh, NC, USA, May 14-19, 2017<br><a href="http://sigmod2017.org/" target="_blank">http://sigmod2017.org/</a><br><br>----------<br> WORKSHOP<br>----------<br><br>Applying

 Machine Learning (ML) in real-world scenarios is a challenging task. In

 recent years, the main focus of the database community has been on 

creating systems and abstractions for the efficient training of ML 

models on large datasets. However, model training is only one of many 

steps in an end-to-end ML application, and a number of orthogonal data 

management problems arise from the large-scale use of ML, which require 

the attention of the data management community. <br><br>Therefore, DEEM 

aims to bring together researchers and practitioners at the intersection

 of applied machine learning, data management, and systems research, 

with the goal of discussing the data management issues that arise in ML 

application scenarios. The workshop solicits *regular research papers 

describing preliminary and ongoing research results*. In addition, the 

workshop encourages the submission of *industrial experience reports of 

end-to-end ML deployments*. Submissions can either be *short papers (4 

pages)* or *long papers (up to 10 pages)* following the ACM proceedings 

format, as described in <a href="https://www.acm.org/publications/proceedings-template" target="_blank">https://www.acm.org/publications/proceedings-template</a>.<br><br>Examples of data management problems in ML are as follows:<br><br> - Simultaneously executing relational and linear algebraic operations in data preprocessing and feature extraction<br> - Choosing among popular classes of ML models (linear models, decision trees, and deep neural networks)<br> - Executing costly offline evaluation processes for choosing features and hyperparameters<br> - Deploying models and integrating them into existing business workflows<br> - Serving fast and efficient online predictions from trained ML models<br><br>Areas of particular interest for the workshop include (but are not limited to):<br><br> - Data Management in Machine Learning Applications<br> - Definition, Execution, and Optimization of Complex ML Pipelines<br> - Systems for Managing the Lifecycle of Machine Learning Models<br> - Systems for Efficient Hyperparameter Search and Feature Selection<br> - Machine Learning Services in the Cloud<br> - Modeling, Storage, and Lineage of ML Experimentation Data<br> - Integration of Machine Learning and Dataflow Systems<br> - Integration of Machine Learning and ETL Processing<br> - Benchmarking of Machine Learning Applications<br> - Definition and Execution of Complex Ensemble Predictors<br> - Architectures for Streaming Machine Learning<br><br>----------------<br>IMPORTANT DATES<br>----------------<br><br>Paper submission deadline:              February 1, 2017<br>Author notification:                    March 1, 2017<br>Deadline for camera-ready copy:         March 20, 2017<br>Workshop:                               Sunday, May 14, 2017<br><br>----------------------<br>SUBMISSION GUIDELINES<br>----------------------<br><br>The

 workshop will have two tracks: one for regular research papers (including 

research in progress) and one for industrial papers (e.g., industrial experience

 reports of end-to-end ML deployments). Submissions can either be *short

 papers (4 pages)* or *long papers (up to 10 pages)* following the ACM 

proceedings format, as described in <a href="https://www.acm.org/publications/proceedings-template" target="_blank">https://www.acm.org/publicatio<wbr>ns/proceedings-template</a>.<br><br>----------------<br>PUBLICATION<br>----------------<br><br>The workshop proceedings will be published in ACM DL and the organizers will prepare a SIGMOD Record report.<br><br>---------------------------<br> ORGANIZERS<br>---------------------------<br><br> - Sebastian Schelter (Amazon)<br> - Reza Zadeh (Stanford & Matroid)<br> - Markus Weimer (Microsoft)<br> - Rajeev Rastogi (Amazon)<br> - Volker Markl (TU Berlin)<br><br>---------------------------<br> PROGRAM COMMITTEE<br>---------------------------<br><br> - Sunita Sarawagi (IIT Bombay)<br> - Sudip Roy (Google)<br> - Rainer Gemulla (University of Mannheim)<br> - Matthias Boehm (IBM Research)<br> - Matthias Seeger (Amazon)<br> - Evan Sparks (UC Berkeley)<br> - Chris Ré (Stanford)<br> - Ted Dunning (MapR Technologies)<br> - Dionysios Logothetis (Facebook)<br> - Nedelina Teneva (University of Chicago)<br> - Vasia Kalavri (KTH Stockholm)<br> - Venu Satuluri (Twitter)<br> - Shannon Quinn (University of Georgia)<br> - Dmitriy Lyubimov (Apache Mahout)<br> - Tilmann Rabl (TU Berlin)<br> - Max Heimel (Snowflake)<br> - Felix Biessmann (Amazon)<br> - Arun Kumar (UC San Diego) <br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div>Nedelina Teneva</div><div><br></div></div>

</div>