Monday, November 11, 2019

Data Analyst/SAS Programmer – D.C. Metro Area

Cadence Group is seeking a part-time Data Analyst/SAS Programmer (30 hours per week) for a contract position with a federal agency in the Washington, DC metro area. The primary duties consist of record linkage activities: identifying, standardizing, and preparing survey data for record linkage; file matching; developing and applying probabilistic matching techniques; interpreting linkage results; developing data update procedures; and developing specifications and related documentation for restricted-use and public-use linked data files.
RESPONSIBILITIES:
  • Using SAS, Stata, and R to conduct QC and maintain documentation for public-use health survey data.
  • Working with analysts to create, update, and modify analytic guidelines, reports, and information on the data linkage website for linked health survey data.
  • Conducting data quality evaluations and documenting the results.
  • Creating record linkage submission files. This involves recoding and standardizing survey data following prescribed file and variable format requirements and identifying records that are ineligible for linkage following guidance from staff.
  • Applying existing scoring algorithms or developing new ones, matching to auxiliary data sources, and analyzing all match results. Linkage results may also need to be cross-analyzed against other data sources.
  • Applying or enhancing standard submission program routines that create alternate record submissions when certain data characteristics are identified, for example by replacing nicknames with proper names.
  • Learning the technical aspects of the data linkage warehouse, including the processes for updating information obtained from a variety of sources.
  • Preparing restricted-use, public-use, and synthetic data files. This task will require the programmer to maintain a detailed knowledge of the variables and file format of a variety of large and complex external linked files.
  • Working with staff to develop new analytic variables that combine multiple sources of similar data and are suitable for epidemiologic research.
  • Producing and delivering up to 10 large linked data files per year, each with potentially 500 to 2,000 variables and several million records. At times the files will need to be subset into survey-specific files, resulting in approximately 30 files per master linked file.
  • Assisting staff in conducting disclosure review, which includes rematching preliminary data files to outside data sources to attempt re-identification at the person level.
  • Providing programming support for loading and maintaining the automated tracking system and the SAS-based tracking master files.
  • Reviewing and evaluating potentially matched records for a variety of record linkage activities.
  • Using SAS to produce requested frequencies, cross tabulations, and other descriptive statistics.
  • Producing documentation for all files produced and programs written. Documentation shall include comments in the program code as well as external descriptions of files and programs.
  • Maintaining a bibliographic database of published articles using linked data products.
REQUIREMENTS:
  • US citizenship
  • Experience writing and maintaining documentation for data sets and data management programs, including specifications, procedures, and metadata
  • Advanced data management skills using SAS and SQL. Proficiency in R and/or Stata a plus
  • Experience matching data files stored in a variety of formats, including SAS, Excel, and ASCII
  • Familiarity with client disclosure avoidance practices and guidelines on safe storage of identifiable data
  • Ability to link files, identify and remove duplicates, and create population subgroups and/or output files
Cadence Group is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, or protected veteran status.
U.S. citizenship is required. SAS programming skills are a must; familiarity with R and Stata is a plus.