Open Science is Better Science

Week 1 of Data Science for NGA LTER REU Students

Liz Dobbins

NGA LTER / Axiom Data Science

2023-07-20

Goals: Why Are We Doing This?

  1. Learn some skills
  2. Build a structure which enables graceful recovery (scaffolding)
  3. Provide resources to use in the future (open a door)
  4. Contribute to the community

Who am I?

  • Liz Dobbins (she/her)
  • Used to work for Seth processing physical oceanographic data
  • Now I work for Axiom Data Science
  • Python, MATLAB, git, Jupyter, Pandas
  • The Troth Yeddha’ Campus is located on the ancestral lands of the Dena people of the lower Tanana River. Troth Yeddha’ means Indian potato ridge.

A map showing air temperature measurements in the Gulf of Alaska.

AOOS Data Portal

Sad animals working alone are welcomed by a smiling fox to a land of bounty.

Credit: Allison Horst/Openscapes

Code of Conduct

  • Use welcoming and inclusive language
  • Be respectful of different viewpoints and experiences
  • Gracefully accept constructive criticism
  • Focus on what is best for the community
  • Show courtesy and respect towards other community members

From The Carpentries Code of Conduct

Tell Me About You

Open Science


Open Science is transparent and accessible knowledge that is shared and developed through collaborative networks.


Open Science now: A systematic literature review for an integrated definition. Ruben Vicente-Saez & Clara Martinez-Fuentes, 2018

Mistakes?

Science, my lad, is made up of mistakes, but they are mistakes which it is useful to make, because they lead little by little to the truth.

Jules Verne, A Journey to the Center of the Earth



Dammit!

Liz Dobbins, just yesterday

Scripting

  • You try something and it doesn’t work. What’s the easiest way to fix it?
  • It did work! How do you do it again?
  • Someone else wants to apply what you did to their work!!! How do you show them how?

Answer: Use Code.

Python, R, or MATLAB: it doesn’t matter which.

import pandas as pd

# Read the temperature record, then plot the column for one site
data = pd.read_csv('temperature.csv')
data['site1'].plot()
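
One way to make the snippet above easy to repeat and to hand to someone else is to wrap it in a small function. The following is only a sketch: the function name `load_site` and the in-memory stand-in for `temperature.csv` (with made-up values) are assumptions for illustration, not part of the course material.

```python
import io

import pandas as pd


def load_site(csv_source, site):
    """Return one site's series from a CSV that has one column per site."""
    data = pd.read_csv(csv_source)
    if site not in data.columns:
        # Fail with a readable message instead of a bare KeyError
        raise KeyError(f"{site!r} not found; available columns: {list(data.columns)}")
    return data[site]


# Hypothetical stand-in for temperature.csv so the sketch runs on its own
example_csv = io.StringIO("site1,site2\n4.1,5.0\n4.3,5.2\n4.8,5.1\n")

site1 = load_site(example_csv, "site1")
print(site1.describe())
```

Fixing a mistake now means editing one function, repeating the analysis means calling it again, and sharing it means sending the file. Calling `.plot()` on the returned series would reproduce the figure, assuming matplotlib is installed.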

What is Version Control?

A drawing of rock climbing monsters.

Credit: Allison Horst

Data Life Cycle

Data Life Cycle. DataONE Best Practices

Data Best Practices

  • We receive data from other people
  • We supply data to other people
  • Data need to be
    • High quality
    • In an accessible format
    • Accompanied by metadata
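
The three requirements above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the values are made up, the range check is a deliberately simple stand-in for real quality control, and the metadata fields are not drawn from any particular standard.

```python
import csv
import io
import json

# Hypothetical measurements: (UTC time, temperature in degrees Celsius)
rows = [
    ("2023-07-01T00:00:00Z", 11.2),
    ("2023-07-01T01:00:00Z", 11.4),
    ("2023-07-01T02:00:00Z", None),  # a missing value
]


def is_plausible(value, low=-5.0, high=40.0):
    """Simple range check; the limits are illustrative only."""
    return value is not None and low <= value <= high


# High quality: flag each value rather than silently dropping it
flagged = [(t, v, "good" if is_plausible(v) else "bad") for t, v in rows]

# Accessible format: plain CSV with a header row
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["time", "temperature", "quality_flag"])
for t, v, flag in flagged:
    writer.writerow([t, "" if v is None else v, flag])
csv_text = buffer.getvalue()

# Metadata: describe every column, including units
metadata = {
    "title": "Example temperature record (made-up values)",
    "columns": {
        "time": "UTC timestamp, ISO 8601",
        "temperature": "temperature, degrees Celsius",
        "quality_flag": "good or bad, from a simple range check",
    },
}
metadata_text = json.dumps(metadata, indent=2)

print(csv_text)
print(metadata_text)
```

In practice the CSV and the metadata record would be written to files that travel together, so whoever receives the data also receives the units and column descriptions.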

Potential Team Project

  • Use Quarto, GitHub, and Markdown to reflect on your REU
  • The content and form of the document are limitless
  • Get a DOI (Digital Object Identifier) for later use
    • Graduate school applications
    • Job applications
  • Tour of this Quarto project