Di’s Lessons in Software Engineering — Abstraction

Di Fan
7 min read · Aug 16, 2019

Foreword

This is the first of a series of articles I am writing for beginner programmers, especially those who chose non-CS majors for their formal education. I described my journey of switching careers in a previous Medium article, and I know how overwhelming or even intimidating learning to program can be for a beginner. There are plenty of tutorials and courses, online and elsewhere, that teach you how to do this or that; however, little guidance exists to help new programmers develop higher-order skills and principles. Even though people often learn programming by doing it, there is still a need to periodically organize the knowledge you have acquired, connect the dots, and build a conceptual framework for strong and competent software engineering practice.

In this series of articles, I want to share the concepts I find most important for becoming a competent programmer, with a healthy disregard for the technical details.

Lesson 1. Abstraction

The source of programmers’ power.

The First Program — Abstraction — When and Why to Abstract

The First Program

Many non-programmers decide to break into software engineering because they have ideas about things they want to build. In my case, I wanted to build a workflow to automate my day job in tax preparation. However, a lot of these new programmers struggle to make their first real program. They are not sure where to start, and it’s difficult for them to translate their ideas into code. To see how the process should be carried out, let’s work through a simple example.

Say I want to create a program that reads my electricity bill in PDF format and extracts the amount due. I want this program because the electricity company emails me the PDF bill every month but never tells me the amount due in the body of the email, so I have to open the attachment and find it. Duh, too much work!

So, how do I find the amount due in a bill? Well, I look at the bill, find where the amount due is listed, and read the number on that line. This is a pretty simple operation for humans to perform; however, for computers, I need to be a bit more specific. A computer program does not know by itself where to find this amount, so I have to give it very direct and clear instructions:

# Lang: python
# Let’s start with a document. Every line that starts with '#'
# is a comment, i.e. not code
doc = get_document() # This function is not defined yet!
pages = get_pages(doc) # Same here and below. It’s fine!
# Go through the pages.
for page in pages:
    content = get_content(page)
    if has_amount_due(content):
        amount_due = get_amount_due(content)
        do_something_with(amount_due)
    else:
        continue  # Go to the next page.

Here it is: my first program. Obviously, it’s not complete, but it lays out the basic process of finding the amount due on a bill in a form that even a computer can understand. For the sake of clarity, let’s go through it step by step:

  • First, the document is opened and assigned to a variable called doc, which represents the PDF document, i.e. the bill.
  • Then, the program gets all the pages of this document.
  • Next, it goes through the pages in a so-called ‘for loop’. In other words, the program will repeat a set of operations for each of the pages, and the operations to be performed are the statements right below the line with the for loop.
  • Inside the loop, the program first reads the page and gets its content.
  • Then, in a conditional statement, the program checks whether the content has the data I am looking for, i.e. the amount due.
  • The conditional statement evaluates to either true or false. If it is true, i.e. the page has the data I am looking for, the program gets the data from the content; otherwise, the program continues to the next iteration and checks the next page.

Note that I used a lot of function calls in this little program, which take the form func_name(arguments). Just like the functions you learned about in math class, i.e. y = f(x), these functions take some number of arguments (zero, one, or many) and return something (or nothing, which some languages call ‘void’).
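As a small illustration of the func_name(arguments) form, here is a sketch with three made-up functions (the names are mine, purely for illustration) that take zero, one, and two arguments. The last one returns nothing, which Python represents as None.

```python
# Hypothetical functions, for illustration only.

def get_greeting():
    # Zero arguments; returns a string.
    return "Hello"

def square(x):
    # One argument; the closest match to y = f(x) from math class.
    return x * x

def log_total(label, amount):
    # Two arguments; prints a line and returns nothing (None in Python).
    print(f"{label}: {amount:.2f}")

greeting = get_greeting()
y = square(3)
result = log_total("Amount due", 42.5)  # result is None
```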

As you can see, my program has all the details necessary to demonstrate, at a high level, how to perform the ‘extract amount due from a bill’ task. The fact that some functions are used without an implementation is totally fine, because they can be addressed later, separately. In practice, I can declare the functions I need first and maybe even give them dummy implementations:

# Declare get_amount_due without actually implementing it.
# This function takes one argument called ‘content’, which is a
# string, i.e. a series of text characters, and returns a float,
# i.e. a decimal number.
def get_amount_due(content: str) -> float:
    # TODO(Di): implement for real.
    return 3.1415926

Note: the program used in this section can be an actual program. I haven’t fully implemented it yet, but it is not pseudocode, which doesn’t follow the syntax of any programming language.
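To show that the high-level program can really run before any of the details are solved, here is a minimal sketch in which every undefined function gets a dummy implementation. All of the bodies and return values are placeholders I made up: the ‘document’ is just a list of strings rather than a real PDF. I also wrapped the loop in a hypothetical main function that returns the amount, so the result is easy to check.

```python
# Dummy implementations: every body below is a placeholder, not real PDF handling.

def get_document():
    # Pretend the bill is a list of page texts instead of a real PDF file.
    return ["Account summary", "Amount due: 3.14"]

def get_pages(doc):
    return doc

def get_content(page):
    return page

def has_amount_due(content):
    return "Amount due" in content

def get_amount_due(content):
    # TODO: implement for real.
    return 3.14

def do_something_with(amount_due):
    print(f"Amount due: {amount_due}")

def main():
    # Same steps as the original program, wrapped in a function.
    doc = get_document()
    for page in get_pages(doc):
        content = get_content(page)
        if has_amount_due(content):
            amount_due = get_amount_due(content)
            do_something_with(amount_due)
            return amount_due
    return None  # No page listed an amount due.

found = main()
```

Each dummy can later be replaced with a real implementation, one at a time, without touching main.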

Abstraction

This ability to address a problem at a high level, without regard to the details, is called abstraction. Take the above program as an example: I need to get the amount due from the content somehow, but at first I don’t care how. Initially, I only care about what the program looks like at a high level. This breaks the big program into many smaller programs, which can in theory be tackled independently.

Wikipedia once had a really nice definition for abstraction:

“Abstraction tries to factor out details from a common pattern so that programmers can work close to the level of human thought, leaving out details which matter in practice, but are not exigent to the problem being solved.”

This brings a lot of benefits:

  • The abstracted code stays close to human thought, just as the program earlier was laid out step by step, with each step describable in plain language.
  • Ideally, the abstraction should be well encapsulated. In other words, changing how one step works should not change the others. This means that 1) programs can be designed without the programmer knowing in advance how to make every part work, 2) the problem can be approached one small piece at a time, and 3) each of the smaller pieces might be given to a different developer or team, potentially allowing the organization to work in parallel.
  • Just as some repetitive operations, e.g. looking for an amount due on a page for each page in a document, can be implemented once and used again and again in a loop, common functionalities can be created and then used over and over again.
  • This is basically what happens with software and websites: they are created only once, but then distributed and used by many people.
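The encapsulation point can be sketched in code. Below are two interchangeable ways of getting the amount due; the caller, a hypothetical report function, only relies on ‘content in, float out’ and does not change when the step behind it does. The bill text format and all function names are assumptions made for this example.

```python
import re

def get_amount_due_by_regex(content: str) -> float:
    # Look for "Amount due" followed by a number such as 1,234.56.
    match = re.search(r"Amount due:?\s*\$?([\d,]+\.\d{2})", content, re.IGNORECASE)
    if match is None:
        raise ValueError("no amount due found")
    return float(match.group(1).replace(",", ""))

def get_amount_due_by_lines(content: str) -> float:
    # Scan line by line and take the last word of the matching line.
    for line in content.splitlines():
        if line.lower().startswith("amount due"):
            return float(line.split()[-1].lstrip("$").replace(",", ""))
    raise ValueError("no amount due found")

def report(content: str, get_amount_due=get_amount_due_by_regex) -> str:
    # The caller does not know, or care, which implementation it was given.
    return f"You owe {get_amount_due(content):.2f}"

# Made-up bill text, for illustration only.
bill = "Electricity bill\nAmount due: $1,234.56\nThank you"
print(report(bill))
print(report(bill, get_amount_due_by_lines))
```

Because both functions honor the same contract, either one could be written by a different person, or replaced later, without anyone editing report.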

We can probably say abstraction is what makes programming so powerful in practice. In fact, Edsger W. Dijkstra said as much:
“The effective exploitation of his powers of abstraction must be regarded as one of the most vital activities of a competent programmer.”

When and Why to Abstract

Abstraction is an important tool that helps programmers structure programs and create them from scratch without necessarily knowing everything in advance. This magical effect is achieved by removing unimportant details from the problem and focusing the programmer’s energy on the most important decisions first. Given the purpose of abstraction, it is relatively easy to guess when this awesome tool would be helpful to you.

First, abstraction is useful when designing a program. Here, a big problem is broken down into smaller problems, reflecting the model we have of the problem at hand. Each smaller problem might be further divided if necessary. At this point, it is not important to figure out how to solve the small problems, at least not until the high-level design is sound.

New programmers often struggle with reading files, parsing command line arguments, or getting a server running, but at design time, these are really dirty details that can be reduced to a separate concern. In other words, when designing your program, just note that you need something that opens a file, but don’t worry about how to do it yet.
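One way to ‘just note that you need something that opens a file’ is to name that something and leave its body for later. In the sketch below, read_bill_text is a deliberately unimplemented placeholder, and a stand-in supplies canned text so the rest of the design can still be exercised. The function names and the sample text are hypothetical.

```python
def read_bill_text(path: str) -> str:
    # Detail deferred: how to open and decode the file is a separate concern.
    raise NotImplementedError("decide how to read the file later")

def count_bill_lines(path: str, read_text=read_bill_text) -> int:
    # The high-level design only assumes "path in, text out".
    return len(read_text(path).splitlines())

def fake_read_text(path: str) -> str:
    # Stand-in used while designing; it ignores the path entirely.
    return "Electricity bill\nAmount due: 42.50"

line_count = count_bill_lines("bill.pdf", fake_read_text)
print(line_count)
```

When the time comes to deal with real files, only read_bill_text needs to be filled in.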

Another amazing thing that happens here is that, when a big problem is divided into smaller problems, some of these smaller problems might already have been solved by someone else. This, again, is the power of reusable software and the open source community. Moreover, at the most fundamental level, all of these problems are composed of some combination of basic computable problems.

Second, abstraction helps relieve the pain when there is a general lack of abstraction in the code. Sometimes you have super detailed code that goes on for page after page. Something happens on each line, but you very quickly lose track of what’s going on, because the code contains too much detail and is not properly abstracted.

Code like this is often said to carry technical debt, as it probably started as a simple program or function, with incremental additions made to it over time. The code never got properly reorganized to remove or hide the heavy details, making it hard to understand and fragile to change. You might have guessed that this kind of thing happens because of poor design, i.e. in the design step I just talked about. That is often true, but sometimes software simply grows too fast, requiring the programmer to re-examine whether the existing abstractions are still sufficient.
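A tiny before-and-after sketch makes the difference visible. Both versions below add up the amounts due from a list of text lines; the second hides the parsing details behind small, named functions. The line format and the function names are assumptions made for this example, and the two versions are only meant to agree on simple inputs like these.

```python
# Before: every detail is inline, so the intent is hard to see.
def total_due_detailed(lines):
    total = 0.0
    for line in lines:
        parts = line.split(":")
        if len(parts) == 2 and parts[0].strip().lower() == "amount due":
            total += float(parts[1].strip().lstrip("$").replace(",", ""))
    return total

# After: the details live behind small functions with descriptive names.
def is_amount_due_line(line):
    label, _, _ = line.partition(":")
    return label.strip().lower() == "amount due"

def parse_amount(line):
    _, _, value = line.partition(":")
    return float(value.strip().lstrip("$").replace(",", ""))

def total_due(lines):
    # Reads almost like the sentence describing the task.
    return sum(parse_amount(line) for line in lines if is_amount_due_line(line))

# Made-up input lines, for illustration only.
bills = ["Amount due: $10.50", "Thank you", "Amount due: $1,000.00"]
print(total_due_detailed(bills), total_due(bills))
```

If the amount format changes later, only parse_amount needs to change in the second version.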

At this point, there are two very important things worth stressing:

  1. Code is written by humans to be read by humans; it should be easy to understand. Machines just happen to execute the code.
  2. The power of software comes from the fact that it is soft. Software can be easily changed to adapt to an ever-changing reality. Rigid code that is fragile to change is very costly.

These also describe the artisanship of software engineering. Now you can see why Dijkstra said abstraction is vital for good programmers.

Next

Well, I described the occasions where abstraction helps and why, but I left out the most important and practical part of the advice: how to abstract. In the next article, I will give a tour of the most well-known abstractions in the programming world and offer some practical advice on performing the act of abstraction.

Di Fan

Traveler, Reader, Dreamer. Writing highly deletable codes.