Treasure for Data Science blogs (A to Z)

This post will help you in your hunt for data science knowledge. The list below makes it easy to find blogs that talk about data science. I hope you find it useful.

A Blog From a Human-engineer-being http://www.erogol.com/ 
Aakash Japi http://aakashjapi.com/ 
Adit Deshpande https://adeshpande3.github.io/ 
Advanced Analytics & R http://advanceddataanalytics.net/ 
Adventures in Data Land http://blog.smola.org 
Agile Data Science http://blog.sense.io/ 
Ahmed El Deeb https://medium.com/@D33B 
Airbnb Data blog http://nerds.airbnb.com/data/ 
Alex Castrounis | InnoArchiTech http://www.innoarchitech.com/ 
Alex Perrier http://alexperrier.github.io/ 
Algobeans | Data Analytics Tutorials & Experiments for the Layman https://algobeans.com 
Amazon AWS AI Blog https://aws.amazon.com/blogs/ai/ 
Analytics Vidhya http://www.analyticsvidhya.com/blog/ 
Analytics and Visualization in Big Data @ Sicara https://blog.sicara.com 
Andreas Müller http://peekaboo-vision.blogspot.com/ 
Andrej Karpathy blog http://karpathy.github.io/ 
Andrew Brooks http://brooksandrew.github.io/simpleblog/ 
Andrey Kurenkov http://www.andreykurenkov.com/writing/ 
Anton Lebedevich’s Blog http://mabrek.github.io/ 
Arthur Juliani https://medium.com/@awjuliani 
Audun M. Øygard http://www.auduno.com/ 
Avi Singh https://avisingh599.github.io/ 
Beautiful Data http://beautifuldata.net/ 
Beckerfuffle http://mdbecker.github.io/ 
Becoming A Data Scientist http://www.becomingadatascientist.com/ 
Ben Bolte’s Blog http://benjaminbolte.com/ml/ 
Ben Frederickson http://www.benfrederickson.com/blog/ 
Berkeley AI Research http://bair.berkeley.edu/blog/ 
Big-Ish Data http://bigishdata.com/ 
Blog on neural networks http://yerevann.github.io/ 
Blogistic Regression http://d10genes.github.io/blog/
blogR | R tips and tricks from a scientist https://drsimonj.svbtle.com/ 
Brain of mat kelcey http://matpalm.com/blog/ 
Brilliantly wrong thoughts on science and programming https://arogozhnikov.github.io/ 
Bugra Akyildiz http://bugra.github.io/ 
Building Babylon https://building-babylon.net/ 
Carl Shan http://carlshan.com/ 
Chris Stucchio https://www.chrisstucchio.com/blog/index.html 
Christophe Bourguignat https://medium.com/@chris_bour 
Christopher Nguyen https://medium.com/@ctn 
Cloudera Data Science Posts http://blog.cloudera.com/blog/category/data-science/ 
colah’s blog http://colah.github.io/archive.html 
Cortana Intelligence and Machine Learning Blog https://blogs.technet.microsoft.com/machinelearning/ 
Daniel Forsyth http://www.danielforsyth.me/ 
Daniel Homola http://danielhomola.com/category/blog/ 
Daniel Nee http://danielnee.com 
Data Based Inventions http://datalab.lu/ 
Data Blogger https://www.data-blogger.com/ 
Data Labs http://blog.insightdatalabs.com/ 
Data Meets Media http://datameetsmedia.com/ 
Data Miners Blog http://blog.data-miners.com/ 
Data Mining Research http://www.dataminingblog.com/ 
Data Mining: Text Mining, Visualization and Social Media http://datamining.typepad.com/data_mining/ 
Data Piques http://blog.ethanrosenthal.com/ 
Data School http://www.dataschool.io/ 
Data Science 101 http://101.datascience.community/ 
Data Science @ Facebook https://research.facebook.com/blog/datascience/ 
Data Science Insights http://www.datasciencebowl.com/data-science-insights/ 
Data Science Tutorials https://codementor.io/data-science/tutorial 
Data Science Vademecum http://datasciencevademecum.wordpress.com/ 
Dataaspirant http://dataaspirant.com/ 
Dataclysm http://blog.okcupid.com/ 
DataGenetics http://datagenetics.com/blog.html 
Dataiku https://www.dataiku.com/blog/ 
DataKind http://www.datakind.org/blog 
DataLook http://blog.datalook.io/ 
Datanice https://datanice.wordpress.com/ 
Dataquest Blog https://www.dataquest.io/blog/ 
DataRobot http://www.datarobot.com/blog/ 
Datascope http://datascopeanalytics.com/blog 
DatasFrame http://tomaugspurger.github.io/ 
David Mimno http://www.mimno.org/ 
Dayne Batten http://daynebatten.com 
Deep Learning http://deeplearning.net/blog/ 
Deepdish http://deepdish.io/ 
Delip Rao http://deliprao.com/ 
DENNY’S BLOG http://blog.dennybritz.com/ 
Dimensionless https://dimensionless.in/blog/ 
Distill http://distill.pub/ 
District Data Labs http://districtdatalabs.silvrback.com/ 
Diving into data https://blog.datadive.net/ 
Domino Data Lab’s blog http://blog.dominodatalab.com/ 
Dr. Randal S. Olson http://www.randalolson.com/blog/ 
Drew Conway https://medium.com/@drewconway 
Dustin Tran http://dustintran.com/blog/ 
Eder Santana https://edersantana.github.io/blog.html 
Edwin Chen http://blog.echen.me 
EFavDB http://efavdb.com/ 
Emilio Ferrara, Ph.D. http://www.emilio.ferrara.name/ 
Entrepreneurial Geekiness http://ianozsvald.com/ 
Eric Jonas http://ericjonas.com/archives.html 
Eric Siegel http://www.predictiveanalyticsworld.com/blog 
Erik Bern http://erikbern.com 
ERIN SHELLMAN http://www.erinshellman.com/ 
Eugenio Culurciello http://culurciello.github.io/ 
Fabian Pedregosa http://fa.bianp.net/ 
Fast Forward Labs http://blog.fastforwardlabs.com/ 
FastML http://fastml.com/ 
Florian Hartl http://florianhartl.com/ 
FlowingData http://flowingdata.com/ 
Full Stack ML http://fullstackml.com/ 
GAB41 http://www.lab41.org/gab41/ 
Garbled Notes http://www.chioka.in/ 
Greg Reda http://www.gregreda.com/blog/ 
Hyon S Chu https://medium.com/@adailyventure 
i am trask http://iamtrask.github.io/ 
I Quant NY http://iquantny.tumblr.com/ 
inFERENCe http://www.inference.vc/ 
Insight Data Science https://blog.insightdatascience.com/ 
INSPIRATION INFORMATION http://myinspirationinformation.com/ 
Ira Korshunova http://irakorshunova.github.io/ 
I’m a bandit https://blogs.princeton.edu/imabandit/ 
Jason Toy http://www.jtoy.net/ 
Jeremy D. Jackson, PhD http://www.jeremydjacksonphd.com/ 
Jesse Steinweg-Woods https://jessesw.com/ 
Joe Cauteruccio http://www.joecjr.com/ 
John Myles White http://www.johnmyleswhite.com/ 
John’s Soapbox http://joschu.github.io/ 
Jonas Degrave http://317070.github.io/ 
Joy Of Data http://www.joyofdata.de/blog/ 
Julia Evans http://jvns.ca/ 
KDnuggets http://www.kdnuggets.com/ 
Keeping Up With The Latest Techniques http://colinpriest.com/ 
Kenny Bastani http://www.kennybastani.com/ 
Kevin Davenport http://kldavenport.com/ 
kevin frans http://kvfrans.com/ 
korbonits | Math ∩ Data http://korbonits.github.io/ 
Large Scale Machine Learning http://bickson.blogspot.com/ 
LATERAL BLOG https://blog.lateral.io/ 
Lazy Programmer http://lazyprogrammer.me/ 
Learn Analytics Here https://learnanalyticshere.wordpress.com/ 
LearnDataSci http://www.learndatasci.com/ 
Learning With Data http://learningwithdata.com/ 
Life, Language, Learning http://daoudclarke.github.io/ 
Locke Data https://itsalocke.com/blog/ 
Louis Dorard http://www.louisdorard.com/blog/ 
M.E.Driscoll http://medriscoll.com/ 
Machinalis http://www.machinalis.com/blog 
Machine Learning (Theory) http://hunch.net/ 
Machine Learning and Data Science http://alexhwoods.com/blog/ 
Machine Learning https://charlesmartin14.wordpress.com/ 
Machine Learning Mastery http://machinelearningmastery.com/blog/ 
Machine Learning Blogs https://machinelearningblogs.com/ 
Machine Learning, etc http://yaroslavvb.blogspot.com 
Machine Learning, Maths and Physics https://mlopezm.wordpress.com/ 
Machined Learnings http://www.machinedlearnings.com/ 
MAPPING BABEL https://jack-clark.net/ 
MAPR Blog https://www.mapr.com/blog 
MAREK REI http://www.marekrei.com/blog/ 
MARGINALLY INTERESTING http://blog.mikiobraun.de/ 
Math ∩ Programming http://jeremykun.com/ 
Matthew Rocklin http://matthewrocklin.com/blog/ 
Melody Wolk http://melodywolk.com/projects/ 
Mic Farris http://www.micfarris.com/ 
Mike Tyka http://mtyka.github.io/ 
minimaxir | Max Woolf’s Blog http://minimaxir.com/ 
Mirror Image https://mirror2image.wordpress.com/ 
Mitch Crowe http://www.dataphoric.com/ 
MLWave http://mlwave.com/ 
MLWhiz http://mlwhiz.com/ 
Models are illuminating and wrong https://peadarcoyle.wordpress.com/ 
Moody Rd http://blog.mrtz.org/ 
Moonshots http://jxieeducation.com/ 
Mourad Mourafiq http://mourafiq.com/ 
My thoughts on Data science, predictive analytics, Python http://shahramabyari.com/ 
Natural language processing blog http://nlpers.blogspot.fr/ 
Neil Lawrence http://inverseprobability.com/blog.html 
NLP and Deep Learning enthusiast http://camron.xyz/ 
no free hunch http://blog.kaggle.com/ 
Nuit Blanche http://nuit-blanche.blogspot.com/ 
Number 2147483647 https://no2147483647.wordpress.com/ 
On Machine Intelligence https://aimatters.wordpress.com/ 
Opiate for the masses Data is our religion. http://opiateforthemass.es/ 
p-value.info http://www.p-value.info/ 
Pete Warden’s blog http://petewarden.com/ 
Plotly Blog http://blog.plot.ly/ 
Probably Overthinking It http://allendowney.blogspot.ca/ 
Prooffreader.com http://www.prooffreader.com 
ProoffreaderPlus http://prooffreaderplus.blogspot.ca/ 
Publishable Stuff http://www.sumsar.net/ 
PyImageSearch http://www.pyimagesearch.com/ 
Pythonic Perambulations https://jakevdp.github.io/ 
quintuitive http://quintuitive.com/ 
R and Data Mining https://rdatamining.wordpress.com/ 
R-bloggers http://www.r-bloggers.com/ 
R2RT http://r2rt.com/ 
Ramiro Gómez http://ramiro.org/notebooks/ 
Random notes on Computer Science, Mathematics and Software Engineering http://barmaley-exe.github.io/ 
Randy Zwitch http://randyzwitch.com/ 
RaRe Technologies http://rare-technologies.com/blog/ 
Rayli.Net http://rayli.net/blog/ 
Revolutions http://blog.revolutionanalytics.com/ 
Rinu Boney http://rinuboney.github.io/ 
RNDuja Blog http://rnduja.github.io/ 
Robert Chang https://medium.com/@rchang 
Rocket-Powered Data Science http://rocketdatascience.org 
Sachin Joglekar’s blog https://codesachin.wordpress.com/ 
samim https://medium.com/@samim 
Sean J. Taylor http://seanjtaylor.com/ 
Sebastian Raschka http://sebastianraschka.com/blog/index.html 
Sebastian Ruder http://sebastianruder.com/ 
Sebastian’s slow blog http://www.nowozin.net/sebastian/blog/ 
SFL Scientific Blog https://sflscientific.com/blog/ 
Shakir’s Machine Learning Blog http://blog.shakirm.com/ 
Simply Statistics http://simplystatistics.org 
Springboard Blog http://springboard.com/blog
Startup.ML Blog http://startup.ml/blog 
Statistical Modeling, Causal Inference, and Social Science http://andrewgelman.com/ 
Stigler Diet http://stiglerdiet.com/ 
Stitch Fix Tech Blog http://multithreaded.stitchfix.com/blog/ 
Storytelling with Statistics on Quora http://datastories.quora.com/ 
StreamHacker http://streamhacker.com/ 
Subconscious Musings http://blogs.sas.com/content/subconsciousmusings/ 
Swan Intelligence http://swanintelligence.com/ 
TechnoCalifornia http://technocalifornia.blogspot.se/ 
TEXT ANALYSIS BLOG | AYLIEN http://blog.aylien.com/ 
The Angry Statistician http://angrystatistician.blogspot.com/ 
The Clever Machine https://theclevermachine.wordpress.com/ 
The Data Camp Blog https://www.datacamp.com/community/blog 
The Data Incubator http://blog.thedataincubator.com/ 
The Data Science Lab https://datasciencelab.wordpress.com/ 
THE ETZ-FILES http://alexanderetz.com/ 
The Science of Data http://www.martingoodson.com 
The Shape of Data https://shapeofdata.wordpress.com 
The unofficial Google data science Blog http://www.unofficialgoogledatascience.com/ 
Tim Dettmers http://timdettmers.com/ 
Tombone’s Computer Vision Blog http://www.computervisionblog.com/ 
Tommy Blanchard http://tommyblanchard.com/category/projects 
Trevor Stephens http://trevorstephens.com/ 
Trey Causey http://treycausey.com/ 
UW Data Science Blog http://datasciencedegree.wisconsin.edu/blog/ 
Wellecks http://wellecks.wordpress.com/ 
Wes McKinney http://wesmckinney.com/archives.html 
While My MCMC Gently Samples http://twiecki.github.io/ 
WildML http://www.wildml.com/ 
Will do stuff for stuff http://rinzewind.org/blog-en 
Will wolf http://willwolf.io/ 
WILL’S NOISE http://www.willmcginnis.com/ 
William Lyon http://www.lyonwj.com/ 
Win-Vector Blog http://www.win-vector.com/blog/ 
Yanir Seroussi http://yanirseroussi.com/ 
Zac Stewart http://zacstewart.com/ 
ŷhat http://blog.yhat.com/ 
ℚuantitative √ourney http://outlace.com/ 
大トロ http://blog.otoro.net/ 

Data Engineer vs Data Scientist (Infographic)

This infographic helps us better understand the skills and responsibilities of a Data Engineer and a Data Scientist. It also compares the salaries and the popular software and tools used by each. Hope this helps!

[Infographic: Data Engineer vs Data Scientist]

Statistical Analysis in MS Excel using KADD Stat!

Most of us know that we can do a few statistical analyses using the ‘Analysis ToolPak’ add-in in Microsoft Excel. We can do more than that using KADD STAT.

KADD STAT is a free, easy-to-use add-in. Almost all versions after Excel 2003 should support it, so students and anyone else who needs basic statistics can make use of it.

Download KADD here https://kelley.iu.edu/mabert/e730/KADD.xla

How to Install the New Version of KADD

Also see video here

Option A

If KADD is currently not stored on your computer, then use the following steps:

  1. Download KADD into a folder on your hard drive
  2. Open up Excel
  3. Go to the Tools option and click on Add-Ins
  4. Click on the Browse button (on the right side of the dialog box) and go to the folder with KADD
  5. Double-Click on KADD
  6. KADDSTAT 3.02 will now show up as an Add-In option. Check it and press OK
  7. KADD will then show up as a menu option across the top. You may have to close Excel and then open it up again before you see KADD in Excel across the top

Option B

If you have already installed the older version of KADD, then use the following steps to upgrade it to the new version:

  1. Download KADD to your hard drive
  2. Open up Excel
  3. Go to the Tools option and click on Add-Ins. KADD (the older version) will be an option in Add-In menu. Remove it by clicking off the check mark and shut down Excel
  4. Now open up Excel again and follow Steps 3 through 7 from Option A

Ready to rock?

A glance at the list of analyses available in the add-in

[Screenshot: List of analysis]

You can calculate probability values

[Screenshot: Probabilities]

1. Find confidence intervals in different scenarios

[Screenshot: Confidence Intervals]

2. Plot a Normal curve

3. Plot ‘Box plots’ for your data

4. Find out the minimal sample size for different scenarios

[Screenshot: Sample Size]

5. Perform hypothesis testing

[Screenshot: Hypothesis Testing]

6. Draw different quality control charts and find out process capability for normal data

[Screenshot: Quality Control]

7. Find out correlation and regression

[Screenshot: Regression and Correlation]

8. Become a forecasting pro! 🙂

[Screenshot: Forecasting]

9. Some financial calculations

[Screenshot: Risk and Return]

10. Find out expected value and variability

11. Build decision trees

12. Linear Programming

[Screenshot: Reference Tables]
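To make the confidence interval and sample size items in the list above concrete, here is a minimal Python sketch of the textbook formulas. It is not taken from KADD, and I have not checked how KADD computes these internally. The function names and the sample numbers are made up for illustration, and it uses the normal (z) approximation; for small samples a t-based interval would be more appropriate.

```python
from math import ceil, sqrt
from statistics import NormalDist, mean, stdev


def z_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the mean of `sample`."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    centre = mean(sample)
    margin = z * stdev(sample) / sqrt(len(sample))
    return centre - margin, centre + margin


def min_sample_size(sigma, margin, confidence=0.95):
    """Smallest n for which z * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil((z * sigma / margin) ** 2)


# Hypothetical measurements, for illustration only
weights = [4.8, 5.1, 5.0, 4.9, 5.2]
low, high = z_confidence_interval(weights)
print(f"95% CI for the mean: {low:.3f} to {high:.3f}")

# Assumed standard deviation of 10 and a desired margin of error of 2
print("Minimum sample size:", min_sample_size(sigma=10, margin=2))
```

Working through the same numbers in the add-in and comparing the results is a quick way to check that you picked the right scenario.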

Doing this is simple if you have data and know what to do. Give it a try and enjoy! Happy learning…

10 famous TV shows related to Data science & AI (Artificial Intelligence)

“If you want to become one, first get inspired by one”

There are always a few interesting ways to learn things and get inspired. Would you like to know a few TV shows that are based on data science and artificial intelligence? We always like to do things the way we love. Here you go, and happy watching (learning)!

[Infographic: 10 TV shows related to Data science & AI]

Thanks to AV for this.

Top 8 Viz features in Excel 2016 !

This is especially for the Excel lovers! In this post, we will look at a few of the new and exciting data visualization features of Excel 2016.

Here is the list of new features

  1. Hierarchy Chart/Tree Map
  2. Sunburst
  3. Waterfall or Stock Chart
  4. Transform Cold data into a cool picture
  5. Instant Histogram
  6. Pareto Chart
  7. 3D map
  8. One click forecast

These are the charts that dashboard creators want most. They are simple and attractive. This set of features makes Excel more competitive with expensive visualization tools.

  1. Hierarchy Chart/Tree Map:

Select the data that you want to chart, then go to the ‘Insert’ tab > Charts > Insert Hierarchy Chart.

Isn’t it cool? OK, on to the next one.

2. Sunburst/Donut Chart:

It is another way to represent a pie chart, an alternative to the boring pie chart. Go to ‘Insert’ > Charts > Insert Hierarchy Chart > Sunburst.

3. Waterfall or Stock Chart

It is recommended to sort the data first to get better insights.

4. Transform Cold data into a cool picture

This one is based on add-ins.

Select the data you want to visualize.

Select ‘Settings’ to change the design of the charts.

5. Instant Histogram:

Create histograms quickly instead of going through the ‘Analysis ToolPak’ add-in. Go to Insert > Charts > Histogram.

6. Pareto Chart:

Earlier, we had to restructure the data to create a Pareto chart, but now a chart that explains the 80/20 principle is just a click away.
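For readers who want to see what the chart does with the data, here is a small Python sketch of the calculation behind a Pareto chart: sort the categories from largest to smallest and track the cumulative percentage. The function name and the defect counts are hypothetical and only there for illustration; this is not how Excel implements the chart.

```python
def pareto_table(counts):
    """Return (category, count, cumulative_percent) rows, largest count first."""
    total = sum(counts.values())
    rows, running = [], 0
    for category, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += count
        rows.append((category, count, round(100 * running / total, 1)))
    return rows


# Hypothetical defect counts, for illustration only
defects = {
    "Scratches": 45,
    "Dents": 30,
    "Misalignment": 15,
    "Discoloration": 7,
    "Other": 3,
}

for row in pareto_table(defects):
    print(row)
```

In this made-up data the first two of five categories account for 75% of all defects, which is the kind of pattern the 80/20 principle describes.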

7. 3D map:

Power Map, the popular 3-D geospatial visualization add-in for Excel 2013, is now fully integrated into Excel. We’ve also given this feature a more descriptive name, “3D Maps”. You’ll find this functionality alongside other visualization features on the Insert tab.

It will open another sheet.

We can then change the theme and other options such as ‘2D Map’. The “Play Tour” option shows the chart as a lively animation.

8. One click Forecast

Forecasting has become easier for data analysts.

Select the data that you want to forecast, go to the ‘Data’ tab and click on “Forecast Sheet”.

Adjust the “Seasonality” setting appropriately, and your forecast is ready.
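As far as I know, the Forecast Sheet is built on Excel's FORECAST.ETS function, which uses an exponential smoothing method that also models trend and seasonality. The Python sketch below shows only the simplest member of that family, simple exponential smoothing, so treat it as an illustration of the idea and not as a reproduction of Excel's results. The function name, the smoothing factor and the sales figures are assumptions made for this example.

```python
def simple_exponential_smoothing(series, alpha=0.5, horizon=3):
    """Flat forecast from simple exponential smoothing (no trend, no seasonality)."""
    if not series:
        raise ValueError("series must not be empty")
    level = series[0]  # start from the first observation
    for value in series[1:]:
        # New level = weighted mix of the latest value and the previous level
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon


# Hypothetical monthly sales, for illustration only
sales = [120, 132, 128, 141, 150, 147]
print(simple_exponential_smoothing(sales, alpha=0.5, horizon=3))
```

A larger alpha makes the forecast follow recent values more closely, while a smaller alpha smooths out the noise. Handling seasonality, which is what the “Seasonality” setting controls, needs the fuller method and is not covered by this sketch.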

Hope you like these features; there is much more to come from Microsoft. Try them out and enjoy!

Data Viz ! Cheat sheet for R Data Analyst

Data visualization has become a vital part of the data science arena. Hence, our key tool should have strong capabilities on both fronts: data analysis as well as data visualization. With this shift in the landscape, R has gained immense popularity because of its splendid data visualization capabilities. With a few lines of code, you can produce beautiful charts and data stories. R has superb libraries for creating basic and more advanced visualizations such as bar charts, histograms, scatter plots, map visualizations, mosaic plots and various others. Below is a cheat sheet of widely used visualizations for representing data. Thanks to my colleague for sharing this.

Data Viz Cheat Sheet

Introducing cricketr! : An R package to analyze performances of cricketers

A very good analysis using R in the field of cricket. Must see ! 🙂

Giga thoughts ...

Yet all experience is an arch wherethro’
Gleams that untravell’d world whose margin fades
For ever and forever when I move.
How dull it is to pause, to make an end,
To rust unburnish’d, not to shine in use!

Ulysses by Alfred Tennyson

Introduction

This is an initial post in which I introduce a cricketing package ‘cricketr’ which I have created. This package was a natural culmination of my earlier posts on cricket and my completing 9 modules of the Data Science Specialization from Johns Hopkins University at Coursera. The thought of creating this package struck me some time back, and I have finally been able to bring this to fruition.

So here it is. My R package ‘cricketr!!!’

This package uses the statistics info available in ESPN Cricinfo Statsguru. The current version of this package only uses data from test cricket. I plan to develop functionality for One-day and…
