Principal Data Engineer (NY/NJ)

United States, NY, New York
Regular
8/28/2023
535096

Medidata: Powering Smarter Treatments and Healthier People

Medidata, a Dassault Systèmes company, is leading the digital transformation of life sciences, creating hope for millions of people. Medidata helps generate the evidence and insights to help pharmaceutical, biotech, medical device and diagnostics companies, and academic researchers accelerate value, minimize risk, and optimize outcomes. More than one million registered users across 2,000+ customers and partners access the world's most trusted platform for clinical development, commercial, and real-world data. Known for its ground-breaking technological innovations, Medidata has supported more than 30,000 clinical trials and 9 million study participants. And Medidata’s ongoing commitment to infusing the patient voice into trial designs and solutions is helping to create a better and more inclusive experience for all participants in clinical studies. Medidata is involved in nearly 40% of company-initiated trial starts globally, with studies conducted in more than 140 countries. More than 70% of novel drugs approved by the Food and Drug Administration (FDA) in 2022 were developed with Medidata software. Medidata is headquartered in New York City and has offices around the world to meet the needs of its customers. Discover more at www.medidata.com and follow us @medidata.

Technical Services at Medidata is seeking a Principal Data Engineer to develop high-quality data engineering services and tools for our clients. The ideal candidate will have a deep understanding of data management in the context of clinical trials; deep working experience in software development, infrastructure, cloud, and data engineering; and experience developing tools for data reconciliation across different data sources and modalities.

Our technology serves an industry that is improving the lives of millions. We collect our clients' data from different modalities into our platform and make it available wherever and whenever customers need it, safely and reliably. There is no room for error. If you are looking for a better place to use your passion and your desire to drive change, this is the place to be. As part of the Technical Services team, the successful candidate will play a significant part as a trusted data engineering and data management advisor across all clients.

You will support technology data and analytics to improve stakeholder engagement, idea intake, roadmap management, and data related to product performance, while ensuring execution and delivery transparency. In this role you will actively contribute to and influence how ideas and strategy become reality, directly impacting clinical and business practices. You will make a difference by solving the impossible.

This position is responsible for full data lifecycle management, including design, prioritization, and development cycle management for end-of-life, new, existing, and acquired products.

Leverage market insights to understand market and customer needs, identify new opportunities, and adjust current TS data offerings. The key to a successful offering is to ask why, to use frameworks and models, and to build solid relationships across business and technology.

The ideal candidate will have the right balance between data strategy, data engineering and execution.

The Principal Data Engineer assists the VP of Technical Services in defining and executing the data strategy within Technical Services and is responsible for the annual data growth goals of a related product portfolio.

Primary Responsibilities:
- Own the data vision and roadmap: create, maintain, and prioritize the backlog, and ensure stakeholders are informed.
- Manage cross-functional stakeholders, including but not limited to design, engineering, scrum teams, and business leaders, functioning as a technical expert with years of experience in development concepts and modern development practices.
- Develop data-as-a-product business cases to support planning.
- Effectively present product data opportunities in leadership discussions with the VP of Technical Services.
- Develop and implement data engineering strategies that align with Medidata's business goals.
- Develop tools and strategies for data reconciliation across different data
sources and modalities.
- Ensure data engineering processes are efficient, effective, and compliant with
regulatory requirements.
- Provide technical leadership and expertise in software development,
infrastructure teams, cloud, data engineering, and data reconciliation.
- Identify opportunities for process improvement and drive the adoption of new technologies and best practices.
- Develop and maintain digital health data pipelines.
- Implement ETL processes focusing on efficiency and reliability.
- Work together with the data scientists to help define healthcare data
ingestion solutions.
- Learn new technologies and their optimal application to our context.
- Be an active interface with other R&D Groups.
- Help to find the best dataset for specific business projects.
- Communicate in a clear and concise way using the most appropriate approach for each different stakeholder.
- Prepare data for machine learning and perform feature selection; evaluate algorithm performance with resampling and machine learning performance metrics; be familiar with classification and regression; automate machine learning workflows; and improve performance with algorithm tuning and ensembles.
- Understand the data with descriptive statistics.
- Understand the data with visualizations.
- Work with raw or aggregated data (statistics) for development and innovation
purposes.
- Properly anonymize data sets that can be extracted and used in a variety of use cases.
- Integrate data sets (building the capability to embed real-world data and synthetic controls into clinical development programs).
- Utilize multiple data collection modalities such as eCOA, medication adherence devices, wearables, and sensors.
- Be familiar with data handling strategies to address illogical data directly at the source and to "disqualify and/or flag" implausible data; and to identify and evaluate data outliers, data trends (such as range, consistency, and variability within and across sites), systematic or significant errors in data collection, and potential data manipulation or data integrity problems.
- Pattern recognition and process optimization (new data science methods to plan, predict, and manage risks in clinical trials based on both RCT and RWD).
- Drive the development of tools that extract meaningful insights to detect potentially unreliable data that threatens the validity of trial results.
- Maintain a high-level understanding of artificial intelligence methods and their scope of applicability.

Requirements:
- 5+ years of experience in a leadership role within a sponsor or CRO organization
- Expertise in software development, infrastructure, cloud, data engineering, and data reconciliation
- Strong project management skills and experience managing cross-functional teams
- Excellent communication skills and the ability to work effectively with both technical and non-technical stakeholders
- Knowledge of regulatory requirements related to data management in clinical trials, including FDA guidance documents and industry standards such as CDISC
- Drive to find new ways to use data engineering in data management, with experience in technology adoption
- Ability to provide guidance on software applications that manage data
- 5+ years of demonstrable experience defining data strategy, roadmap, and requirements and translating them into artifacts that drive data management. This includes market evaluation, persona development, and capability and journey mapping
- 5+ years’ experience defining the product vision, goals, and benefits, and then tracking KPIs to measure success and outcomes
- Understanding of Good Clinical Practice (GCP), protocols, protocol deviations, metadata, basic SDTM mapping, and programming
- With prior sponsor/CRO experience, you will be a proactive advisor who lets clients know what is possible based on Medidata products and platform capabilities, communicating effectively with client organizations and recommending solutions
- Above all, the candidate should know how study data should be processed and how systems should be configured and validated to produce reliable data that fits the protocol endpoints
- Strong communication and planning skills
- Ability to work independently and autonomously
- Represent the culture of performance, collaboration and opportunity.

Who You Are:

- M.S. in computer science, mathematics, engineering, or statistics with long-term experience, ideally in the pharmaceutical industry
- Experience in data modeling, wrangling and visualization
- Strong knowledge of SQL, NoSQL solutions, Python and the most common data management libraries (Pandas, PySpark)
- Good knowledge of AWS with a specific focus on data-oriented services
- Knowledge of the most common data management platforms as well as knowledge of data streaming frameworks, Hadoop, Spark, Hive and other big data frameworks
- Enjoy developing unique patient-centric and data-driven solutions to today's challenges in clinical trials
- Experience with methodology for providing virtual or synthetic controls to clinical trials based on RWD and legacy clinical trial data
- Passion for simplifying the patient and site journey when implementing decentralized elements
- Deep knowledge of data including the characteristics of different types of data, such as EHR, ePRO, medically validated devices, imaging, clinical data
- Understand the implications of data context, quality, source, amount, and workflow
- Advanced analytical and technical skills to interrogate and mine high volumes of data from a variety of data sources
- Attention to detail
- Therapeutic area knowledge
- Communication skills in articulating complex data findings to business teams
- Systematic data review and trending
- Design of data collection tools
- Strong analytical skills, team player and communication skills
- Ability to work in teams
- Fluent in English, both written and spoken

As with all roles, Medidata sets ranges based on a number of factors including function, level, candidate expertise and experience, and geographic location.

The salary range for positions that will be physically based in the NYC Metro Area is $184,500-246,000.

Base pay is one part of the Total Rewards that Medidata provides to compensate and recognize employees for their work. Most sales positions are eligible for a commission on the terms of applicable plan documents, and many of Medidata's non-sales positions are eligible for annual bonuses. Medidata believes that benefits should connect you to the support you need when it matters most and provides best-in-class benefits, including medical, dental, life and disability insurance; 401(k) matching; unlimited paid time off; and 10 paid holidays per year.

#LI-TC1

#LI-Hybrid

Diversity statement

As a game-changer in sustainable technology and innovation, Dassault Systèmes is striving to build more inclusive and diverse teams across the globe. We believe that our people are our number one asset and we want all employees to feel empowered to bring their whole selves to work every day. It is our goal that our people feel a sense of pride and a passion for belonging. As a company leading change, it’s our responsibility to foster opportunities for all people to participate in a harmonized Workforce of the Future.

Equal opportunity

In order to provide equal employment and advancement opportunities to all individuals, employment decisions at 3DS are based on merit, qualifications and abilities. 3DS is committed to a policy of non-discrimination and equal opportunity for all employees and qualified applicants without regard to race, color, religion, gender, sex (including pregnancy, childbirth or medical or common conditions related to pregnancy or childbirth), sexual orientation, gender identity, gender expression, marital status, familial status, national origin, ancestry, age (40 and above), disability, veteran status, military service, application for military service, genetic information, receipt of free medical care, or any other characteristic protected under applicable law. 3DS will make reasonable accommodations for qualified individuals with known disabilities, in accordance with applicable law.
