- Published on
You Should Know SQL
- Authors
- Name
- Linell Bonnette
You Should Know SQL
I like being a âfull stackâ developer. Having the ability to think up a new feature and personally see it go from just an idea all the way to a finished product gives me not only a great sense of pride but also the ability to work more autonomously. I donât need to wait for another team to build out an API or for the front end to finalize a UI mockup. I can just sit down and make things happen.
But.
Specialization isnât a bad thing either. For example: Iâm just not great at creating beautiful user interfaces. The ones that I create work well and theyâre not necessarily ugly, but there are people on my team who can do them much better than I can. So they do. Duh. Itâs hard to argue that the UI person on the team should know the intricacies of an ETL system if their day to day work never needs them to â they should focus on whatâs providing the most value for their team.
The biggest place I personally see this going afoul is when it comes to interacting with data. I think that just about everyone should be familiar enough with SQL to probe around in a database without any help. Itâs not a hard skill to learn the basics of and will really up your game as a developer.
No Waiting for Information
Whether youâre a product manager or a UI/UX Designer, knowing how to write a query to answer your questions is invaluable and will increase your productivity.
For example: letâs say youâre working on refactoring a portion of an application and you realize that if you make a little tweak userâs experience will be much nicer overall. The only problem is that, for some reason, itâs not backwards compatible. You could just shrug it off and ignore the tweak â after all, you canât just break something that tens of thousands of people use, right? You could also just ignore the problem and forge ahead with the new tweak â thereâs an argument that if itâs easier than before itâll be used more anyway, so letâs move fast and break things, right? Alternatively, you could make an informed decision by figuring out the exact number of users that would be affected. The quickest way to do that is likely by writing a SQL query to check something in your database. Instead of having to ask someone else for help, or even just punting to your boss, wouldnât it be great to be able to fetch the information yourself and go from there?
Itâs Like Riding A Bike
Itâs worth saying again that Iâm not saying every developer needs to be able to give a deep dive into PostgreSQLâs internals. Knowing how to write a select statement and how to join tables is enough to figure out the vast majority of things youâll realistically encounter. The beauty of this is that, for the most part, this basic knowledge is extremely portable.
I can write pretty much the same query for querying a PostgreSQL database, a MySQL database, and CloudWatch logs via Amazon Athena. Just about any system that stores data is going to use a syntax familiar enough that it wonât be a huge hurdle to figure out. The point here is that once youâve learned enough SQL to be dangerous, youâve learned enough to take that knowledge and apply it across a myriad of different applications.
Write Better Code
Okay, so maybe you never need to actually delve into any data and you donât foresee that ever coming up in your life. You should still learn SQL because it will help you write better code. How?
The ability to know that what youâre working on is slow because the underlying query is filtering by columns without an index is a big deal. Instead of either ignoring the problem and having slow code or writing some sort of caching or abstraction to make things seem faster, you can just fix the real problem. The important part is that you knew enough about the database to realize that the problem may lie there.
Knowing SQL helps me understand what the heck my ORM is doing, why itâs doing that, and ways I may be able to improve my codeâs performance. For example, letâs say I want to find every Contact
(which belongs to a Student
) a User
has access to based off of the students in that userâs scope.
students = Student.scope_to_user(user_id).map(&:id)
Contact.where(:student_id => students).all
The above code is doing two queries, and on one of those queries weâre also looping all the way through the dataset.
-- first query, which we then loop over via .map(&:id)
select *
from students
where id in user_student_scope('user_id');
-- second query
select *
from contacts
where student_id in ('list', 'of', 'student', 'ids');
Because we know SQL we know that we can do better than that, though.
students = Student.scope_to_user(user_id).select(:id)
Contact.where(:student_id => students).all
This doesnât look very different, but what weâve done is write code that only makes one query and avoids looping over anything. The generated SQL is something like:
select *
from contacts
where student_id in (
select id
from students
where id in user_student_scope('user_id')
);
Now obviously youâve got to know the ORM a little bit for this particular example but the point stands: without knowing SQL you wouldnât know why the first example is less performant code than the second.
Iâm Not Alone
Who cares what I think anyway, right?
- SQL: One of the most valuable skills - Craig Kerstiens
- Nine Reason Developers Should Learn SQL
- What ORMs have taught me: just learn SQL
- Should experienced programmers know database queries? - Software Engineering Stack Exchange
If that's still not enough, here's a little checklist that I think sums it up well:
Server performance checklist: 1) its usually SQL 2) you are using SQL inefficiently 3) too much SQL 4) your ORM is shit 5) its usually SQL
â Jeff Atwood (@codinghorror) February 1, 2014