Today is Wednesday, and it’s time for a weekly recap of interesting news and events in the data analysis world from the Data Analysis Journal.
If you are just now finding out about Data Analysis Journal, you can subscribe here. If you enjoy it, please forward this email to your friends!
✨ Today we will be discussing:
The ongoing Meow attack that hit 12K unsecured databases, deleting all user data.
The top 10 must-read blogs on data analysis.
All the things you can do in SQL.
Streaming data from Amazon S3 to Amazon Kinesis.
Registering for a free online Python workshop from PyData Global.
📚 Weekend Longread
Data Engineering. Read this article from the AWS Big Data blog which describes a solution for converting batch processing to near real-time using AWS DMS.
This solution streams data from Amazon S3 to Kinesis for analysis, production line monitoring, or supply chain optimization. The author does a thorough yet accessible job of walking you through how to set up Kinesis as an AWS DMS target so that multiple systems can consume the data simultaneously.
🔥 What’s new this week
Excel turned 35 a few weeks ago! Remember when we just scrawled in the dirt with sticks to keep track of our numbers before that?
A few months ago I wrote about an ongoing Meow attack that wiped out 1,000 unsecured databases. Hackers deleted all user data, leaving only the word “Meow” behind. They didn’t stop there. This month that number has grown to over 12,603 affected databases, including Elasticsearch, Redis, and MongoDB instances. A Meow attack might sound like a charming experience where a cat leaps into your arms, but it’s unfortunately far more damaging than that.
Check out a nice walkthrough on how to turn a Postgres table into a NumPy array, how to take advantage of the user-defined Python functions supported in Postgres, and why to use Python in Postgres at all.
Want to learn Python but don’t know where to start? PyData Global is running a free online Python beginners' workshop on November 7th. Hurry up, as places are limited, and preference goes to applications from under-represented minorities in tech. Register here: https://humbledata.org/event/pydataglobal2020.html
🏆 All the great things you can do with SQL!
Being a big SQL enthusiast, I love watching how people use SQL not only to extract data, but to perform advanced formatting, and even develop and code programs, algorithms, and ML. In SQL. Yes.
Read a step-by-step guide created by an engineer who used SQL to build a simple anomaly detection system.
Check out aito.ai - a predictive database and the next generation of ML. Its SQL queries can not only extract but also predict data. Hello, future.
And here you can get inspired by someone who apparently got bored at work and developed a battleships game using PostgreSQL.
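To give a feel for the anomaly detection idea mentioned above, here is a minimal sketch that flags outliers with a z-score rule written in plain SQL. It is my own illustration, not necessarily the approach taken in the linked guide. The table name `readings`, the helper `find_anomalies`, and the threshold of 2 standard deviations are all assumptions, and it runs on an in-memory SQLite database from Python so you can try it without any setup.

```python
import sqlite3


def find_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean.

    Hypothetical helper: the statistics and the filtering both happen in SQL.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
    conn.executemany(
        "INSERT INTO readings (value) VALUES (?)", [(v,) for v in values]
    )

    # SQLite has no built-in STDDEV, so the population variance is computed as
    # E[x^2] - E[x]^2, and squared distances are compared to avoid needing SQRT.
    query = """
        WITH stats AS (
            SELECT AVG(value) AS mean,
                   AVG(value * value) - AVG(value) * AVG(value) AS variance
            FROM readings
        )
        SELECT r.value
        FROM readings AS r, stats AS s
        WHERE s.variance > 0
          AND (r.value - s.mean) * (r.value - s.mean) > ? * ? * s.variance
        ORDER BY r.id
    """
    rows = conn.execute(query, (threshold, threshold)).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(find_anomalies([10, 11, 9, 10, 12, 10, 11, 100]))
```

In a database such as Postgres you could likely lean on a built-in standard deviation aggregate instead of the manual variance, but the shape of the query stays the same: compute summary statistics once, then compare each row against them.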
🎓 Level Up
Certifications, internships, schools, and courses.
Today I want to share my top 10 favorite blogs on data analysis, marketing, technology, coding, and more that I have been reading for a while. If you work with data, I highly recommend subscribing to these to stay in the loop on the newest technology updates, research, offers, and deep-dive cases.
Data analysis:
Avinash Kaushik's Blog on digital marketing and analytics. I have been reading Avinash’s blog for many, many years now and have adopted many analytical concepts from him. Also, I highly recommend his book Web Analytics 2.0 on developing a strategy and applying techniques to measure marketing campaigns and experimentation.
Reforge Blog - read about systems and frameworks for data analysis. If you enjoy Reforge materials, you might also like Brian Balfour's Newsletter on growth, strategy, and user acquisition.
Analytic Bridge - a DS Central Community Channel focused more on Analytics and Business Intelligence. Data Science Central is another popular blogging platform. That being said, I stopped reading it after a few cases of fraudulent authors misrepresenting themselves and simply reposting content from Medium.
Product Analysis:
GrowthMarketer - one place to learn the best growth marketing tools, tactics, and strategies. Subscribe to read about conversion optimization practices, digital marketing strategies, must-know tools, and concepts.
Product-Led Growth Collective - read articles about product development and product growth with a heavy focus on SaaS.
Data science and engineering:
DB Weekly - a weekly round-up of database technology news and articles covering new developments, SQL, NoSQL, document databases, graph databases, and more.
Postgres Weekly - a weekly email round-up of Postgres news and articles.
Towards Data Science - a very popular Medium blog about data science concepts, ideas, and code.
KDnuggets is another well-known blog about ML, DS, Big Data, Analytics, and AI.
Analytics Vidhya is a good resource for ML and statistics. Make sure to check out their Glossary of common statistics terms. Very helpful.
Other interesting links and resources:
R Bloggers - Everything you could want to know about R, written by R users.
Real Python is not a blog, but it's my favorite source for learning Python.
Day Zero: Always Learning - a blog about entrepreneurship, technology, the Cloud, and SaaS.
Lenny's Newsletter on product, growth, and people management.
🍸 Drink and Mingle
Upcoming free events, meetups, talks, and webinars
Oct 14, Tableau: Mitigating Bias in Analytics
Oct 15, Slido: Customer-Centric Product Innovation
Oct 17, GDG UK & Ireland: DevFest 2020
Oct 20, Neo4J: NODES 2020 Neo4j Online Developer Expo and Summit
Oct 20, Galvanize: Intro to Computer Vision
Oct 21, ProductBoard: Product Excellence Summit 2020
Oct 28, Anaconda: Working with Data in the Cloud
Oct 28, DSS: Credit Risk - Why Model Fairness Is Needed
🙏 Do Some Good.
It pays back.
I have a favor to ask. As my audience grows, I’d like to better understand how this journal can stay interesting and help you become a better data analyst or data scientist and progress in your career. I’d appreciate it if you could take a moment to fill out this brief survey.
Stay safe, everyone! Until next Wednesday!