Peter Bull, 173a18ac88, 9 years ago

| Name | Last updated |
|---|---|
| {{ cookiecutter.repo_name }} | 9 years ago |
| .gitattributes | 9 years ago |
| README.md | 9 years ago |
| cookiecutter.json | 9 years ago |
cookiecutter-data-science
An opinionated, but not-afraid-to-be-wrong project template for data science projects. Pull requests welcome. Debate encouraged.
Requirements to create project:
- Python 2.7 or 3.5
- cookiecutter Python package
To start a new project:
cookiecutter git@github.com:drivendata/cookiecutter-data-science.git
Data
**By default, the `data` folder is included in the `.gitignore` file.** If you have a small amount of data that rarely changes, you may want to include the data in the repository. GitHub currently warns if files are over 50MB and rejects files over 100MB. Some other options for storing large data include AWS S3 with a syncing tool (e.g., `s3cmd`), Git Large File Storage, Git Annex, and dat.
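If you do decide to commit data, it can help to check file sizes before pushing. The following is a minimal sketch, not part of the template: the function names are hypothetical, and it assumes the 50MB and 100MB limits mentioned above are counted in units of 1024 * 1024 bytes.

```python
from pathlib import Path

# Assumption: the limits are binary megabytes (1024 * 1024 bytes).
WARN_BYTES = 50 * 1024 * 1024     # files above this size trigger a warning
REJECT_BYTES = 100 * 1024 * 1024  # files above this size are rejected


def classify_size(size_bytes):
    """Return 'reject', 'warn', or 'ok' for a file size in bytes."""
    if size_bytes > REJECT_BYTES:
        return "reject"
    if size_bytes > WARN_BYTES:
        return "warn"
    return "ok"


def scan_folder(folder):
    """Yield (path, status) for every file under folder that is not 'ok'."""
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            status = classify_size(path.stat().st_size)
            if status != "ok":
                yield path, status
```

Calling `list(scan_folder("data"))` returns the files that would be warned about or rejected, which are candidates for one of the external storage options instead.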
The preferred workflow if data is not in the repository is to have a make command, `make data`, that will download or create the relevant datasets.
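As a rough illustration of what such a command might run, here is a minimal sketch of a download script. It is an assumption about one possible setup, not the template's own implementation: the names `fetch_dataset` and `make_data` are hypothetical, the dataset mapping is supplied by the caller, and it uses only the Python 3 standard library.

```python
import os
import urllib.request


def fetch_dataset(url, dest_path):
    """Download url to dest_path unless the file already exists.

    Returns True if a download happened, False if the file was already there.
    """
    if os.path.exists(dest_path):
        return False
    dest_dir = os.path.dirname(dest_path)
    if dest_dir:
        os.makedirs(dest_dir, exist_ok=True)
    urllib.request.urlretrieve(url, dest_path)
    return True


def make_data(datasets, data_dir="data"):
    """Fetch every {file name: URL} entry into data_dir.

    Returns the sorted list of file names that were newly downloaded.
    """
    fetched = []
    for name, url in sorted(datasets.items()):
        if fetch_dataset(url, os.path.join(data_dir, name)):
            fetched.append(name)
    return fetched
```

A `data` target in the Makefile could invoke a script that calls `make_data` with the project's own file names and URLs. Skipping files that already exist keeps repeated runs of `make data` cheap.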