Data Science & Machine Learning Newsletter #102

Do you want to get updates? Please join the Data Science & Machine Learning Newsletter LinkedIn group.

Gradient Boosting explained
“Gradient boosting (GB) is a machine learning algorithm developed in the late ’90s that is still very popular. It produces state-of-the-art results for many commercial (and academic) applications. This page explains how the gradient boosting algorithm works using several interactive visualizations.”

AI and Deep Learning in 2017 – A Year in Review

Text Read more […]

Data Science & Machine Learning Newsletter #101

Python module to perform under sampling and over sampling with various techniques
“imbalanced-learn is a python package offering a number of re-sampling techniques commonly used in datasets showing strong between-class imbalance. It is compatible with scikit-learn and is part of scikit-learn-contrib projects.”

On Machine Learning and Programming Languages
“While machine Read more […]

Data Science & Machine Learning Newsletter #100

The 10 Statistical Techniques Data Scientists Need to Master
“A data scientist is a person who is better at statistics than any programmer and better at programming than any statistician.”

Unleash a faster Python on your data
“Get real performance results and download the free Intel® Distribution for Python that includes everything you need for blazing-fast computing, Read more […]

Data Science & Machine Learning Newsletter # 99

Ultimate Guide To Statistical Significance Tests in R
Implement and interpret the commonly used statistical significance tests in R: their purpose, when to use them, and how to interpret the results.

Word2Vec (skip-gram model)
The skip-gram neural network model is actually surprisingly simple in its most basic form. Train a simple neural network with a single hidden layer to perform Read more […]

Data Science & Machine Learning Newsletter # 98

CAN (Creative Adversarial Network) - Explained
GANs (Generative Adversarial Networks), a type of deep learning network, have been very successful in creating non-procedural content. This work explores the possibility of machine-generated creative content.

109 Commonly Asked Data Science Interview Questions
For a data science interview, an interviewer will ask questions Read more […]

Data Science & Machine Learning Newsletter # 97

How to Apply Machine Learning to Event Processing
How do you combine historical Big Data with machine learning for real-time analytics? An approach is outlined with different software vendors, business use cases, and best practices.

J.P. Morgan’s massive guide to machine learning and big data jobs in finance
J.P. Morgan’s quantitative investing and derivatives strategy team, Read more […]

Data Science & Machine Learning Newsletter # 96

Python Plotting for Exploratory Analysis

What method can be used to detect seasonality in data?
A really good way to find periodicity in any regular series of data is to inspect its power spectrum after removing any overall trend.

Word2vec with tens of billions of items, what could possibly go wrong?

Projects related to text summarization on GitHub:
https://github.com/miso-belica/sumy
https://github.com/abisee/pointer-generator
https://github.com/LazoCoder/Article-Summarizer
https://github.com/hengluchang/newsum
https://github.com/sriniiyer/codenn
https://github.com/davidadamojr/TextRank

An Read more […]
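The seasonality tip in this issue (inspect the power spectrum after removing the overall trend) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `detrend`, `power_spectrum`, and `dominant_period` are hypothetical, the trend is assumed to be a straight line, the toy series is made up, and a naive discrete Fourier transform stands in for the FFT routine a real analysis would likely use.

```python
import cmath
import math


def detrend(series):
    """Subtract the least-squares straight line (assumed trend shape)."""
    n = len(series)
    mean_t = (n - 1) / 2
    mean_y = sum(series) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(series))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var
    return [y - (mean_y + slope * (t - mean_t)) for t, y in enumerate(series)]


def power_spectrum(series):
    """Return (frequency, power) pairs for k = 1 .. n // 2 using a naive DFT."""
    n = len(series)
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            y * cmath.exp(-2j * math.pi * k * t / n) for t, y in enumerate(series)
        )
        spectrum.append((k / n, abs(coeff) ** 2 / n))
    return spectrum


def dominant_period(series):
    """Detrend, then return the period (in samples) with the most power."""
    residual = detrend(series)
    freq, _ = max(power_spectrum(residual), key=lambda pair: pair[1])
    return 1 / freq


# Made-up toy data: a linear trend plus a cycle that repeats every 12 samples.
series = [0.5 * t + 10 * math.sin(2 * math.pi * t / 12) for t in range(120)]
print(round(dominant_period(series)))
```

On this toy series the detrended spectrum peaks at a period of 12 samples. Running `power_spectrum` on the raw series instead puts the peak at the lowest frequency, because the trend dominates, which is why the trend is removed first.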

Data Science & Machine Learning Newsletter # 95

7 Techniques to Handle Imbalanced Data
This blog post introduces seven techniques that are commonly applied in domains like intrusion detection or real-time bidding, where the datasets are often extremely imbalanced.

Bayesian Deep Learning with Edward (and a trick using Dropout)

Elegant N-gram Generation in Python
A quick few snippets of code – solving how to Read more […]

Data Science & Machine Learning Newsletter # 94

Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
Deep learning thrives with large neural networks and large datasets. However, larger networks and larger datasets result in longer training times that impede research and development progress. Distributed synchronous SGD offers a potential solution to this problem by dividing SGD minibatches over a pool of parallel workers. …

Time Read more […]

Data Science & Machine Learning Newsletter # 93

Feature Engineering: Data Scientist’s Secret Sauce!
It is very tempting for data science practitioners to opt for the best-known algorithms for a given problem. However, it is not the algorithm alone that provides the best solution; a model built on carefully engineered and selected features can provide far better results.

pix2code: Generating Code from a Graphical Read more […]