
BigData: Resources

How to

Datasets

Data.World https://data.world/

Data for Everyone https://www.crowdflower.com/data-for-everyone/

Crisis Text Line http://www.fastcompany.com/3056936/crisis-text-line-is-opening-its-treasure-trove-of-data-to-researchers

Yahoo datasets http://yahoolabs.tumblr.com/post/137281912191/yahoo-releases-the-largest-ever-machine-learning

Kaggle datasets https://www.kaggle.com/datasets

Movie sentiment analysis http://ai.stanford.edu/~amaas/data/sentiment/

Public data http://enigma.io/

80 million tiny images http://horatio.cs.nyu.edu/mit/tiny/data/

Google public data http://www.google.com/publicdata/directory

Stanford Large Network Dataset Collection http://snap.stanford.edu/data/index.html

Wikipedia database https://en.wikipedia.org/wiki/Wikipedia:Database_download

OpenStreetMap http://planet.openstreetmap.org/

Newsgroups http://qwone.com/~jason/20Newsgroups/

Social network analysis http://www.growmeme.com/snas

Data.gov http://catalog.data.gov/dataset

Webscope http://webscope.sandbox.yahoo.com/catalog.php

NCBI Gene http://www.ncbi.nlm.nih.gov/gene

DrugBank (open data drug & drug target database) http://www.drugbank.ca/databases

UCI Machine Learning Repository http://archive.ics.uci.edu/ml/datasets.html

Random Hacks of Kindness http://www.rhok.org/problems

Wireless networks http://crawdad.cs.dartmouth.edu/about.php

Friendster Social Network Dataset https://archive.org/details/friendster-dataset-201107

Amazon public datasets http://aws.amazon.com/public-data-sets/

Book-Crossing Dataset (for recommender systems) http://www2.informatik.uni-freiburg.de/~cziegler/BX/

Guidelines for final report & poster

Final report and poster

Retrieved from http://www.cs.unm.edu/~estrada/teaching/trilce/index.php?n=BigData.Resources
Page last modified on January 17, 2017, at 07:29 PM EST