24 Best Python Libraries You Should Check in 2023. Batteries included python

Batteries included: using Python scripts in BuildIT

Did you know that you can use Python scripts in BuildIT? If you transform your Python files into executable files, you can import them into BuildIT processes. With this workflow, you can combine the richness of Python libraries with the ergonomic practicality of BuildIT processes.

In this post, we’ll focus on a very basic workflow for generating a random number in Python, which we will then use within a BuildIT process to move a constructed point.

Using Pyinstaller to create .exe files from Python files

In this section, we will transform an existing .py file into a .exe file. We used Pyinstaller to do this, but there are other software options available for this purpose.

  • Install Python
  • Install pywin32
  • Install Pyinstaller
  • In a text editor, add the contents of the Python file we will use:
      #! python
      import random
      a = random.gauss(10, 1)
      print(a)
  • To build the executable, write in the command line: pyinstaller.exe name_of_the_file.py

For more information, you can also follow these instructions for creating an executable from a Python script.
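
As a slightly more structured variant of the script above, here is a minimal sketch that wraps the random draw in a function and prints the result only when the file is run directly. The function name `random_distance` and its default arguments are illustrative assumptions, not part of the original workflow.

```python
#!/usr/bin/env python3
"""Print one normally distributed random number to standard output."""
import random


def random_distance(mean: float = 10.0, std_dev: float = 1.0) -> float:
    """Return one sample from a normal distribution (hypothetical helper)."""
    return random.gauss(mean, std_dev)


if __name__ == "__main__":
    # The calling process reads the printed value from standard output.
    print(random_distance())
```

Building this file with Pyinstaller should work the same way as for the shorter script, since the behavior when executed is unchanged: one number is printed.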

Using the .exe file inside the BuildIT process file

In this section, we will explain the steps to include the created executable in a BuildIT process.

  • Open a BuildIT process
  • Click on Add command: Category: General: Command: Shell command
  • In the added Shell command, change the parameters as follows:
      Script: the path of your .exe file created by Pyinstaller
      Return Variable: SHELLRESULT
  • Click on Add command: Category: Construct: Command: Point: Coordinates
  • Click on Add command: Category: Edit: Command: Move
  • In the added Move command, change the parameters as follows: Distance: (SHELLRESULT)

That’s it! When you run the process, the created point will move randomly.
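
To make the data flow concrete, the sketch below mimics in plain Python what the process does conceptually: run a command, capture what it prints, and use that number as a move distance. This is not BuildIT's API. The helper `run_and_capture`, the starting point, and the choice to move along the x axis are all assumptions for illustration, and the Python interpreter stands in for the Pyinstaller-built .exe.

```python
import subprocess
import sys


def run_and_capture(command: list[str]) -> float:
    """Run a command and parse its standard output as a float (hypothetical helper)."""
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return float(completed.stdout.strip())


# Stand-in for the .exe: the interpreter runs the same one-line script.
command = [sys.executable, "-c", "import random; print(random.gauss(10, 1))"]

# Plays the role of the SHELLRESULT return variable.
shell_result = run_and_capture(command)

# Assumed starting point, moved along x by the captured distance.
point = (0.0, 0.0, 0.0)
moved_point = (point[0] + shell_result, point[1], point[2])
print(moved_point)
```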


Our aim here was to give a very simple example of plugging Python “batteries” into the BuildIT power tool. Using this powerful combination, we can imagine and implement many combinations and potentialities for various applications. We’ll expand more on this in the coming posts.

Best Python Libraries You Should Check in 2023

Python is often labeled a “batteries-included programming language.” This simply means that it comes with a number of prepackaged libraries developers can use to make their jobs much easier. As you may expect, there is a sea of libraries available for this interpreted, high-level, general-purpose programming language.

There’s no doubt that one of the biggest reasons Python is so popular is the fact that there are over a hundred thousand libraries available to choose from. The more libraries and packages a programming language has at its disposal, the more diverse use cases it can have.

Out of all the thousands and thousands of Python libraries available, it can be difficult to tell which ones are good and which ones are forgettable. To help you out, we’ve listed some of the best Python libraries in this article. Is your favorite on this list? Read on to find out!

What is a Python Library?

Before we can answer this question, first, we must talk about what a library is — at least in terms of programming. Libraries are comprised of classes, utility methods, and modules. As you code your applications, these components can come in quite handy. Rather than having to write code from scratch, you can use library components to perform certain tasks within your code. As a result, you save a lot of time and effort. Plus, libraries make code reusable while also establishing a standard among developers.

So what exactly are libraries in Python?

As one of the most widely used programming languages in recent years, Python is used for a huge range of purposes and applications. One of the biggest reasons why Python is so popular is because it comes with a massive variety of open-source libraries that are not just free, but also easy to use.

Python libraries are collections of helpful modules, functions, classes, and more. These libraries help developers speed up their processes by working with preexisting code without the need to reinvent the wheel. Needless to say, libraries allow developers to focus on the important parts of their applications since they no longer have to code everything from scratch.

It’s worth noting that because Python is used in such a wide variety of industries, there are top Python libraries for just about any purpose you can think of.

What to Consider When Choosing a Python Library to Use

Now that you know what a library in Python can do for you, the next question on your mind might be, “how do I choose the right one?” It’s completely understandable to wonder this — after all, with more than 137,000 Python libraries available to date, how do you decide which one is the best for your needs?

When you’re faced with such a massive selection, it can be hard to make a decision. Some might even feel paralyzed, unsure how to go about making their choice, and some end up simply coding what they need from scratch. You don’t need to go through that.

Here are some things to consider when choosing from the best Python libraries:

  • What is your intended purpose? Knowing the primary purpose or intent of your project is important to help you narrow down the list of viable Python libraries. To further shrink your pool of selections, consider any additional fields, purposes, and specialties that may intersect with this primary purpose. For example, if your project is data science-focused, you’ll likely want a library that can also support data management and data visualization.
  • What version of Python are you using? These days, there are different versions of Python you can use for your projects. If you’ve chosen a certain version for your application, you must make sure that any libraries you use are compatible with that version.
  • Will this library work with the other libraries you are using? If you are using multiple libraries, it’s a good idea to make sure that they all work well with each other. Incompatible or overlapping libraries may cause you more trouble than they are worth.
  • Will the library fit your budget? There is an abundance of open-source Python libraries you can use entirely for free. If you can find some that suit your project perfectly, you might not even need to pay for any libraries at all. However, there are some libraries that will require you to pay for access. You may want to consider a library’s cost before you proceed with your decision.

Top Python Libraries in 2023


Requests

Primary Intent: Making HTTP requests simpler

Secondary Intent(s): None

One of the most popular general Python libraries is Requests, which aims to make HTTP requests simpler and more human-friendly. Licensed under the Apache2 license and written in Python, Requests is the de facto standard used by developers for making HTTP requests in Python.

In addition to sending HTTP requests to a server, the Requests library lets you attach form data, content, headers, multipart files, and more. With the library, developers need not add a query string to the URL or form-encode the POST data manually.

The Requests library abstracts the numerous complexities of making HTTP requests behind a simple API so that developers can focus more on interacting with services. Current releases support Python 3 only (older releases also supported Python 2.7), and the library works well on PyPy too.

  • Allows multipart file uploads and streaming downloads
  • Automatic content decoding and automatic decompression
  • Browser-style SSL verification
  • Features can be customized and optimized as per requirements
  • Keep-Alive Connection Pooling
  • Supports international domains and URLs
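
As a small illustration of the automatic query-string and form encoding described above, the sketch below prepares two requests without sending them, so no network access is needed. The URLs and field names are placeholders chosen for the example.

```python
import requests

# Query parameters are encoded into the URL for us.
get_request = requests.Request(
    "GET",
    "https://example.com/search",
    params={"q": "python libraries", "page": 2},
).prepare()
print(get_request.url)

# A dict passed as `data` is form-encoded and the matching header is set.
post_request = requests.Request(
    "POST",
    "https://example.com/signup",
    data={"name": "Ada", "language": "Python"},
).prepare()
print(post_request.headers["Content-Type"])
print(post_request.body)
```

In everyday use you would more likely call `requests.get(url, params=...)` or `requests.post(url, data=...)`, which prepare and send the request in one step.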


Pillow

Primary Intent: Image manipulation

Secondary Intent(s): Image archival, image display

Python Imaging Library or PIL is a free Python library that adds an image processing ability to the Python interpreter. In simple terms, PIL allows manipulating, opening, and saving various image file formats in Python. Created by Alex Clark and other contributors, Pillow is a fork of the PIL library.

In addition to offering powerful image processing capabilities, Pillow offers an effective internal representation and extensive file format support. The core Python library is designed to offer fast access to data stored in a few basic pixel formats.

  • Effective debugging support using the show method
  • Ideal for batch processing applications
  • Identifies and reads a vast range of image file formats
  • Offers BitmapImage, PhotoImage, and Windows DIB interfaces
  • Supports arbitrary affine transforms, color space conversions, filtering with a set of built-in convolution kernels, image resizing and rotation, and point operations
  • The histogram method allows pulling some statistics out of an image, which can be used for automatic contrast enhancement and global statistical analysis
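
The resizing, rotation, color conversion, and histogram features listed above can be tried on an image created in memory, as in this minimal sketch. The image size and color are arbitrary choices for the example.

```python
from PIL import Image

# Create a solid red 64x48 image in memory (no file needed).
image = Image.new("RGB", (64, 48), color=(255, 0, 0))

# Resize to half the size, then rotate by 90 degrees, growing the canvas to fit.
thumbnail = image.resize((32, 24))
rotated = thumbnail.rotate(90, expand=True)

# Convert to grayscale and count how many pixels fall in each brightness level.
gray = image.convert("L")
histogram = gray.histogram()

print(thumbnail.size, rotated.size, len(histogram))
```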


Scrapy

Primary Intent: Web scraping

Secondary Intent(s): Automated testing, data mining, web crawling

Scrapy is a free and open-source Python framework that is widely used for web scraping and a number of other tasks, including automated testing and data mining.

Initially, Scrapy was developed for web scraping but has evolved to fulfill other purposes over the years. The library offers a fast and high-level method for crawling websites and extracting structured data from web pages.

Written in Python, Scrapy is built around spiders that are basically self-contained crawlers, which are provided a set of instructions. Abiding by the DRY (don’t repeat yourself) principle, Scrapy makes it easier to build and scale full-fledged web crawling projects.

  • Easy to write a spider to crawl a website and extract data
  • Follows the DRY principle
  • Offers a web-crawling shell that allows developers to test a website’s behavior
  • Supports exporting scraped data using the command line


asyncio

Primary Intent: Working with asynchronous code

Secondary Intent(s): None

Numerous Python developers around the world make use of the asyncio library for writing concurrent code using the async/await syntax. In most cases, the asyncio library is ideal for IO-bound and high-level structured network code.

asyncio has been used for building various Python asynchronous frameworks that offer database connection libraries, distributed task queues, high-performance network and web servers, and much more. The library comes with a number of high-level and low-level APIs.

  • Allows controlling subprocesses, distributing tasks via queues, performing network IO and IPC, and synchronizing concurrent code
  • Bridge callback-based libraries and code with async/await syntax using low-level APIs
  • Comes with a set of high-level APIs for concurrently running Python coroutines and having full control over their execution
  • Eases working with asynchronous code
  • Supports the creation and management of event loops, implementing effective protocols using transports
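
Here is a minimal sketch of the high-level API for running coroutines concurrently. The coroutine `fetch` only sleeps to simulate waiting on I/O, and its names and delays are made up for the example.

```python
import asyncio


async def fetch(name: str, delay: float) -> str:
    """Simulate an I/O-bound task by sleeping without blocking the event loop."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def main() -> list[str]:
    # gather() runs the coroutines concurrently and returns results
    # in the order the coroutines were passed in.
    return await asyncio.gather(
        fetch("a", 0.2),
        fetch("b", 0.1),
        fetch("c", 0.05),
    )


results = asyncio.run(main())
print(results)
```

The three tasks overlap, so the total run time is close to the longest delay rather than the sum of all three.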


Tkinter

Primary Intent: GUI development

Secondary Intent(s): None

When used with Tkinter, Python offers an easy and fast way of creating GUI applications. Tkinter is the standard GUI library for the Python programming language. It offers a powerful object-oriented interface for the Tk GUI toolkit.

Creating a GUI application using Tkinter is very easy. All you need to do is to follow these simple steps:

  • Import Tkinter
  • Create the main window for the GUI application under development
  • Add one or more Tkinter Widgets
  • Enter the main event loop for taking action for each user-triggered event
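
The four steps above can be sketched as follows. The window title, widget texts, and the function names `build_window` and `run` are illustrative assumptions. Nothing is shown until `run()` is called, so the file can be imported on a machine without a display.

```python
import tkinter as tk  # Step 1: import Tkinter


def build_window() -> tk.Tk:
    """Create the main window and its widgets (hypothetical helper)."""
    window = tk.Tk()  # Step 2: create the main window
    window.title("Hello Tkinter")

    # Step 3: add widgets and place them with the pack geometry manager.
    label = tk.Label(window, text="Click the button")
    label.pack(padx=20, pady=10)
    button = tk.Button(
        window,
        text="Click me",
        command=lambda: label.config(text="Clicked!"),
    )
    button.pack(pady=10)
    return window


def run() -> None:
    """Step 4: enter the main event loop, which handles user-triggered events."""
    build_window().mainloop()
```

Calling `run()` opens the window and blocks until it is closed.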

Tkinter offers over 15 types of widgets, including buttons, labels, and text boxes. Each of them has access to some specific geometry management methods that serve the purpose of organizing widgets throughout the parent widget area.

  • Comes with a range of widgets that support geometry management methods
  • Eases the development of GUI applications
  • Supports an effective object-oriented interface


Six

Primary Intent: Compatibility library (wrapping over differences between Python 2 and Python 3)

Secondary Intent(s): None

Although simplistic, Six is a powerful Python library that is meant to smooth out the differences between various Python 2 and Python 3 versions. Six is intended for supporting codebases that can operate on both Python 2 and Python 3 without the need for modifications.

The Six library is super-easy to use thanks to being offered as a single Python file. Hence, it is ridiculously easy to copy the library into a Python project. The name Six reflects (Python) 2 x (Python) 3.

  • Simple utility functions for making Python code compatible with Python 2 and Python 3
  • Supports every version since Python 2.6
  • Very simple to use, as it is contained in a single Python file


aiohttp

Primary Intent: Serve as an asynchronous HTTP Client/Server

Secondary Intent(s): None

Another simple yet widely used Python library is aiohttp. It is basically meant to be an asynchronous HTTP client or server in Python. Beyond this, and out-of-the-box support for Client WebSockets and Server WebSockets, there’s little more to this Python library.

  • Offers a web server that has middlewares, pluggable routing, and signals
  • Provides out-of-the-box support for both Client WebSockets and Server WebSockets
  • Supports both HTTP client and HTTP server


Pygame

Primary Intent: 2D game development

Secondary Intent(s): Multimedia app development

Pygame is a free and open-source Python library that is meant for accomplishing multimedia application development in Python, especially two-dimensional gaming projects. Hence, it is widely used by both casual and professional Python game developers.

Under the hood, Pygame makes use of the SDL (Simple DirectMedia Layer) library. Like the SDL library, the Pygame library is highly portable and thus provides support for a wide number of platforms and operating systems.

It is possible to port applications developed using Pygame to Android-powered devices, like smartphones and tablets. For this purpose, pgs4a (Pygame subset for Android) needs to be used.

  • Doesn’t demand OpenGL
  • Makes it easy to use multi-core CPUs
  • No GUI is required for using all available functions
  • Provides support for a wide range of platforms and operating systems
  • Simple and easy to use
  • Uses Assembly code and optimized C code for implementing core functions


Kivy

Primary Intent: Application development (with innovative user interfaces)

Secondary Intent(s): None

For building mobile apps and multi-touch application software with a NUI (Natural User Interface), Python developers rely on the Kivy library. The free and open-source Python library is distributed under the MIT license and runs on Android, iOS, Linux, macOS, and Windows.

In actuality, Kivy is the evolution of the PyMT project. It contains all the necessary elements for building an intuitive multi-touch application, namely a graphics library, a wide range of widgets with multi-touch support, an intermediate language (Kv), and extensive input support.

Kv, or the Kivy language, is an intermediate language dedicated to describing user interactions and interface. It makes it very easy to create a complete UI and add interaction(s) to it. Kivy also provides support for the Raspberry Pi.

  • Ability to natively use most devices, inputs, and protocols
  • Cross-platform
  • Offers over 20 highly-extensible widgets
  • Graphics engine built on OpenGL ES 2


Bokeh

Primary Intent: Developing visualization-based applications

Secondary Intent(s): Data visualization, data science

An interactive visualization library for the Python programming language, Bokeh allows visualizing data in a beautiful and meaningful way inside contemporary web browsers. The data visualization library eases the creation of dashboards, data applications, and interactive plots.

In addition to offering concise and elegant construction of versatile graphics, the Bokeh library also extends its capability with high-performance interactivity over streaming or very large datasets.

  • Allows building complex statistical plots with simple commands
  • Bokeh visualizations can be easily embedded into two of the most popular Python frameworks, Django and Flask
  • Capable of producing elegant and interactive data visualizations
  • Multiple language bindings (Julia, Lua, Python, and R)
  • Various output formats


NumPy

Primary Intent: Scientific and numerical computing

Secondary Intent(s): Data analyses, forms foundations of other Python libraries like SciPy

NumPy is one of the best open source Python modules for scientific and numerical computing and data analyses. In fact, it even provides the foundation for a few other Python libraries, such as SciPy and Sci-Kit Learn. NumPy is most often used for mathematical operations with matrices and arrays. Thanks to its efficient yet quick computations, NumPy is the Python library of choice for many scientists performing data analyses.

NumPy can also process multidimensional arrays, which is why so many developers and data scientists use it for AI (artificial intelligence) and ML (machine learning) projects.

  • Efficient thanks to array-oriented computing
  • Uses vectorization for compact yet faster computations
  • Supports an object-oriented or OO approach
  • Provides numerical routines in fast and precompiled functions
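
Vectorization means an operation is written once and applied to a whole array, with no explicit Python loop. A minimal sketch with made-up temperature readings:

```python
import numpy as np

# Two days of temperature readings in Celsius (rows = days, columns = readings).
celsius = np.array([
    [18.5, 21.0, 19.5],
    [25.0, 27.5, 30.0],
])

# One expression converts every element; no explicit loop is needed.
fahrenheit = celsius * 9 / 5 + 32

# Reduce along axis 1 to get one mean per day.
daily_mean = fahrenheit.mean(axis=1)

print(fahrenheit)
print(daily_mean)
```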


SciPy

Primary Intent: Data visualization and manipulation

Secondary Intent(s): Linear algebra, optimization algorithms, image operations (multidimensional)

Like NumPy, SciPy is free and open source, making it accessible for everyone. SciPy is based on NumPy and can also be used for technical and scientific computing with large data sets. It plays a critical role in engineering and scientific analysis, which is why it’s also considered an important library in Python. Some even call it a foundational library for the programming language.

SciPy works well for the purposes of image manipulation. It has high-level commands often used for data manipulation and visualization.

  • Collections of functions and algorithms all built upon NumPy
  • Has inbuilt functions meant to solve differential equations
  • SciPy ndimage submodule for image processing (multidimensional)
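
As an example of the built-in differential equation solvers, the sketch below integrates the simple decay equation dy/dt = -2y with `scipy.integrate.solve_ivp` and compares the result with the known exact solution. The equation and tolerances are chosen only for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp


def decay(t, y):
    """Right-hand side of dy/dt = -2y."""
    return -2.0 * y


# Integrate from t = 0 to t = 1, starting at y(0) = 1, with tight tolerances.
solution = solve_ivp(decay, t_span=(0.0, 1.0), y0=[1.0], rtol=1e-8, atol=1e-10)

numeric = solution.y[0, -1]   # value at the final time
exact = np.exp(-2.0)          # exact solution y(1) = e^(-2)
print(numeric, exact)
```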

Sci-Kit Learn

Primary Intent: Machine learning applications

Secondary Intent(s): Statistical modeling

Sci-Kit Learn (officially spelled scikit-learn, imported as sklearn, and originally released as scikits.learn) is based on both NumPy and SciPy. This free Python library is often considered a SciPy extension. Sci-Kit Learn was designed specifically for developing machine learning algorithms and for data modeling.

Many consider Sci-Kit Learn one of the top Python libraries thanks to its consistent, simple, and intuitive interface. Thanks to how user-friendly this library is, many consider it to be an excellent choice for beginners.

  • Machine-learning library
  • Offers practically all of the algorithms you need for machine learning
  • Built on SciPy, NumPy, and Matplotlib
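
The consistent interface mentioned above comes down to the same `fit` and `predict` calls on every estimator. Here is a minimal sketch using linear regression on synthetic data that follows y = 2x + 1 exactly. The data is invented for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic training data: one feature, target follows y = 2x + 1.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = 2.0 * X.ravel() + 1.0

# Every estimator follows the same pattern: construct, fit, predict.
model = LinearRegression()
model.fit(X, y)

prediction = model.predict(np.array([[5.0]]))
print(model.coef_, model.intercept_, prediction)
```

Swapping in a different estimator usually means changing only the import and the constructor line.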


Theano

Primary Intent: Machine and Deep Learning

Secondary Intent(s): Evaluating, analyzing, and manipulating mathematical expressions

The numerical computation library known as Theano was created expressly for machine learning. Many developers use Theano to create deep learning models thanks to the library’s features. The great majority of Theano users are deep learning and machine learning developers.

Theano also offers the ability to integrate with NumPy, should you ever have the need to. When Theano is used with a GPU (graphics processing unit, such as a video card) instead of a CPU (central processing unit, like the Intel Core i5 or i7 or the AMD Ryzen equivalents), it can perform its intensive data computations as much as 140 times faster.

  • Integrates with NumPy
  • Works with CPUs but works much more efficiently with GPUs; can perform intensive operations much faster using your GPU
  • Optimized for stability and speed
  • Employs multidimensional arrays for creating deep learning models


TensorFlow

Primary Intent: Deep learning and traditional machine learning; large numerical computations

Secondary Intent(s): Text-based apps, video detection, speech/image recognition, time-series analyses

TensorFlow is an open-source library originally developed by researchers at Google.

It specializes in differentiable programming, and it was designed mainly for machine learning and deep learning as well as other workloads in predictive and statistical analytics.

TensorFlow’s collection of resources and built-in tools help developers have a much easier time building their machine learning and deep learning models. TensorFlow also helps make building neural networks much more straightforward for developers, whether they are beginner-level or professional.

The framework and architecture of TensorFlow are highly flexible, allowing the library to be used with both CPUs and GPUs. However, if you wish to unlock TensorFlow’s full power, you’ll need to work with a TPU (Tensor processing unit). This library also isn’t limited to desktop devices — you can also use it on smartphones and servers.

  • Frequent updates and new releases bring new features and fixes
  • Backed by Google
  • Better visualizations for computational graphs


PyTorch

Primary Intent: Data science

Secondary Intent(s): Deep learning research

PyTorch is another open-source library often used for data science. This library, which is based on Torch (a machine learning framework scripted in Lua on top of a C backend), can also integrate with other Python libraries like NumPy. PyTorch builds its computational graphs dynamically, so they can be changed while the Python program is running.

PyTorch is most often used in DL and ML applications, including NLP (natural language processing) and computer vision. This library is well-known for being able to execute fast, even when handling heavy loads. PyTorch is also flexible, allowing it to work on CPUs, GPUs, and even simplified processors.

Users can expand PyTorch using its collection of APIs.

  • Tensor computations using GPU acceleration for faster and more efficient processing
  • Simple, easy-to-use API
  • Uses dynamic computation graphs
  • Has a strong community behind it


Keras

Primary Intent: Deep Learning and Machine Learning

Secondary Intent(s): Data Visualization

The open-source Python library Keras was designed primarily for the development and evaluation of neural networks inside machine learning and deep learning models. This library can function on top of TensorFlow and Theano, which means developers can start training their neural networks with not much code.

Keras is flexible and extensible while also being modular, which is why it’s a great choice even for beginners. This library is also portable, which means you can use it in various environments and on both GPUs and CPUs.

Developers also often use Keras for data visualization or modeling.

  • Supports Theano and TensorFlow backends
  • Provides prelabeled data sets developers can use directly to load/import
  • Offers simple and consistent APIs
  • Easy-to-learn and use, smaller learning curve


Pandas

Primary Intent: Data Science

Secondary Intent(s): Data Analysis and machine learning

Pandas is one of the most popular Python libraries today, at least in the field of data science. Pandas is yet another library that was built on top of NumPy. This library allows users to build seamless yet intuitive high-level data structures. Pandas is used in a variety of industries, ranging from statistics to engineering and even in finance.

One thing that makes Pandas great is its flexibility and the ability to use it alongside other numerical and scientific Python libraries.

  • Used in many commercial fields, including finance, neuroscience, and statistics
  • Also used in academic fields
  • Has eloquent syntax
  • Built on top of NumPy
  • Helps take care of a lot of the tedious and time-consuming tasks related to data
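
A minimal sketch of the high-level data structures at work: a small DataFrame, a derived column, and a grouped total. The sales figures and column names are invented for the example.

```python
import pandas as pd

# A small table of sales records.
sales = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "units": [10, 7, 5, 8],
    "price": [2.5, 3.0, 2.5, 3.0],
})

# Column arithmetic is applied row by row without an explicit loop.
sales["revenue"] = sales["units"] * sales["price"]

# Group the rows by region and total the revenue in each group.
revenue_by_region = sales.groupby("region")["revenue"].sum()
print(revenue_by_region)
```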


Matplotlib

Primary Intent: Data Visualization

Secondary Intent(s): Machine Learning

Matplotlib is an open-source Python library often touted as an alternative to the paid solution MATLAB. Matplotlib, which is built on NumPy and commonly used alongside SciPy, was made for data visualization and is used to create graphs and plots. Matplotlib can also work with the complex data models output by Pandas as well as data structures created by NumPy.

Matplotlib is primarily a 2D plotting library, although its mplot3d toolkit adds basic 3D plotting. It remains highly capable of producing publication-ready data visualizations in the form of line plots, diagrams, histograms, scatter plots, error charts, and of course, bar charts.

Because of how simple and intuitive Matplotlib is, many beginners choose to work with it when starting out in data visualization. It’s also the choice of many developers who already have plenty of experience with other data visualization tools.

  • Open source, makes for a good replacement for MATLAB (a paid solution)
  • Low memory usage
  • Strong community support
  • Offers various types of data visualizations (boxplots, scatterplots, bar charts, histograms, error charts, and more)
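
Here is a minimal sketch of a basic line plot. It selects the non-interactive Agg backend and writes the PNG into memory, so it runs without a display. The data and labels are arbitrary.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend: render without a display
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [value ** 2 for value in x]

fig, ax = plt.subplots()
ax.plot(x, y, marker="o", label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()

# Save the figure as PNG into an in-memory buffer instead of a file.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)

png_bytes = buffer.getvalue()
print(len(png_bytes), "bytes of PNG data")
```

To write a file instead, pass a path such as `"plot.png"` to `fig.savefig`.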


Seaborn

Primary Intent: Data visualization

Secondary Intent(s): Machine learning

Much like Matplotlib, Seaborn is a Python library made for plotting and data visualization. In fact, this open source library was based on Matplotlib itself, although Seaborn also includes some of Pandas’s extensive data structures. Seaborn has a high-level interface full of features that allows users to create statistical graphs that are not just accurate but also informative.

Many developers and Seaborn users will agree that this library creates some of the best-looking data visualizations, which is why this library is perfect for use in marketing and publishing applications.

Users also enjoy Seaborn for its ability to create these plots and graphs with simple commands and minimal code, making it a time-saver for many.

  • Built upon Matplotlib
  • Allows developers to create attractive, informative graphs using the high-level interface
  • Create a range of plots such as pairwise plots, histograms, bar plots, scatter plots, and more


Beautiful Soup

Primary Intent: Data Science

Secondary Intent(s): Web scraping

Beautiful Soup’s name refers to its ability to parse HTML and XML documents even when they contain malformed markup, known as “tag soup”. This Python package extracts data from web pages that have already been downloaded, preparing it for future manipulation. As an incredibly versatile package, Beautiful Soup is one of the tools of choice for many data analysts and scientists. Machine learning and deep learning developers also use Beautiful Soup to obtain data for training their ML/DL models.

  • Allows data extraction from HTML and XML, even from documents with malformed or incomplete markup (like non-closed tags)
  • Began as a parser for HTML that could make “tag soup” (malformed markup) workable or even ‘beautiful’
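
As a minimal sketch of extracting data from malformed markup, the snippet below parses HTML in which neither link tag is closed and still recovers both link targets. The HTML string is invented for the example, and Python's built-in `html.parser` is used as the underlying parser.

```python
from bs4 import BeautifulSoup

# Malformed markup: neither <a> tag is ever closed.
html = '<div><a href="/docs">Docs<a href="/blog">Blog</div>'

soup = BeautifulSoup(html, "html.parser")

# Both links are still found, in document order.
links = [anchor["href"] for anchor in soup.find_all("a")]
print(links)
```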


PyCaret

Primary Intent: Machine Learning

PyCaret got its name from being a Python library based on Caret, a machine learning library in the programming language R. This open source library was also created for machine learning, and as a result, it offers some features to help simplify and automate ML programs.

Although it has a bit of a learning curve, PyCaret is relatively easy to use.

  • High-level and low-code library
  • Works to automate workflows in machine learning
  • Helps to speed up experimental cycles, increasing productivity
  • Allows developers to deploy ML models using very little code


OpenCV

Primary Intent: Computer Vision and Image Processing

Secondary Intent(s): Machine Learning

As a Python library, OpenCV is comprised of various functions, making it a great tool for computer vision programs in real-time. This highly efficient library can process various visual inputs not just from images but also from video data. OpenCV can identify faces, handwriting, and objects.

  • Performs tasks like object tracking, face detection, landmark detection, and more
  • Provides developers with access to more than 2,500 classic, state-of-the-art algorithms
  • Used extensively even by tech giants like Google, IBM, Toyota, and more
  • Also used in image/video analysis


LightGBM

Primary Intent: Machine Learning

LightGBM stands for Light Gradient Boosting Machine. It is a free gradient boosting framework that was developed by Microsoft for the purposes of machine learning. It is user-friendly and intuitive and can be learned much more easily than some other libraries for deep learning.

  • Offers plenty of memory-efficient yet fast computational power
  • Originally developed by Microsoft
  • Capable of dealing with large amounts of data
  • Offers high accuracy results

That’s All!

Out of the hundreds of thousands of Python libraries available, the list above includes some of the very best. It’s good to know that these libraries often get upgrades and enhancements to help them keep up with Python’s growth and booming popularity.

Knowing one of these popular libraries can help further your learning of the language while also helping to make you a better Python developer all around. Think we missed a top library on this list? Let us know in the comments below!

Frequently Asked Questions

What are Python libraries?

Python libraries are collections of functions, modules, and other components that allow developers to use preexisting code for certain tasks. Libraries can be general or more for specific purposes. They can save developers a ton of time and effort by preventing the need to code a huge chunk of an application from scratch.

How many libraries are there in Python?

There are more than 137,000 Python libraries in existence. However, not all of them are created equally, and you’ll find that some are much better than others.

What is a Python library example?

There are many hugely popular Python packages and libraries. If you’re looking for Python library examples, consider some of the big names below:

  • Requests
  • NumPy, SciPy, Sci-Kit Learn
  • PyTorch
  • Pandas
  • Seaborn
  • Theano
  • TensorFlow

How do I get a list of Python libraries?

If you mean listing all of the modules, packages, or libraries currently installed in your version of Python, you can follow the guide here. The guide contains all the instructions you need for making a Python libraries list.
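
One possible way to produce such a list from within Python itself is sketched below, using the standard library's `importlib.metadata` (Python 3.8 and later). It lists installed distributions, meaning packages installed with tools such as pip, and does not include standard-library modules. This is an illustrative approach, not necessarily the one in the linked guide.

```python
from importlib.metadata import distributions

# Collect (name, version) pairs for every installed distribution,
# skipping any entries whose metadata lacks a name.
installed = sorted(
    {
        (dist.metadata["Name"], dist.version)
        for dist in distributions()
        if dist.metadata["Name"]
    },
    key=lambda item: item[0].lower(),
)

for name, version in installed:
    print(f"{name}=={version}")
```

Running `pip list` on the command line gives a similar listing.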

What are Python libraries used for?

Python libraries are used to make a developer’s job much easier and more convenient. Instead of needing to code portions of projects from scratch, developers can take modules and bundled code from libraries and use those in their projects instead. Libraries can also establish coding standards, making code maintenance easier to do.

Are all Python libraries free?

Most major libraries do allow free commercial use. However, not all libraries are as easy to use and figure out. Thus, although a library may be free, you may need to pay for licensed or paid modules or software to make debugging and maintenance easier in the long term. Additionally, although many libraries are free for commercial use, you may have to pay if you’d like to include certain modules in your applications for distribution to future customers.

How do libraries work in Python?

Python libraries allow developers like you to take modules and bundles of code and use them repeatedly for various projects and purposes. Libraries prevent the need for you to code things from scratch repeatedly, as you can simply take preexisting code and add it to yours.

Batteries Included Build Automation

Clone repository, run local build… build failure. It works on the build server. It also works on Grant’s workstation. But it doesn’t work for you. Does this sound familiar? If so, then you may be a victim of Irritating Build Syndrome (IBS). Fortunately, there is a cure.

Nobody wants to chase down umpteen build dependencies just to get to the point where they can start work. It’s not just a problem for new team members. Anyone who has been in the software development game for any period of time has surely experienced a nero (near zero progress) day due to a broken local build environment.

A Brief History of [Some] Build Automation Tools

From Make to Ant to tools like Maven, Gradle and NPM, build automation has had a long evolutionary history. Decades ago, Ant was the cool new build tool on the block. Those who built Windows applications (including this author) remember the joys of trying to manage DLL Hell with InstallShield, too. At the time, most builds shared basic commonalities, but almost all of them were lovingly handcrafted. A good build engineer was worth his or her weight in gold because they were the foundation of the team’s ability to produce working software.

Five years after Ant was released, Maven was released, and with it came a giant leap forward for build automation. Like Make, Ant was basically build automation through scripting, albeit scripting that hooked into Java code. Maven, on the other hand, brought with it project management and dependency management in addition to build automation. Maven also heavily favored the use of conventions. So, unlike Ant, where “ant” could mean different things to different projects based on whatever was in the build.xml (which could be anything), with Maven “mvn install” was a phase that had meaning attached to it.

Maven’s dependency management is still the de facto standard for Java projects, and since its introduction, similar dependency management functionality has been introduced for other languages and platforms, like NuGet for .NET. Building upon Maven’s dependency management functionality, and borrowing similar project management functionality, Gradle was released 2 years after Maven’s initial release. Gradle combined Groovy scripting with a convention-based project approach to build automation. Gradle gained more prominence in 2014, with the release of Google’s Android Studio with Gradle-based build support.

It should be noted that in terms of dependency management, Maven was not the first. For example, the introduction of PyPI predates Maven by 2 years.

A lot of Java build automation tooling was discussed above, and for a good reason. Although Java doesn’t enjoy the language popularity it once did, Java and JVM-based languages are still quite popular. Java has also been around since 1995. So, it’s a great example to illustrate how build automation tools have evolved over the past 20 years.

Modern Challenges

With all the advances in build automation tooling, one might think that the problems build automation faced in the past have long been solved. Unfortunately, builds still break, dependencies still go missing, and oftentimes a working and productive development environment can take days to set up. Brittle builds are still a reality, one that is best discussed using an example.


The Example Project

Let’s assume we have a relatively small system consisting of a REST API written in Java, a Web front-end written in ReactJS, a command line interface (CLI) written in Python and a couple of Terraform configurations that define the infrastructure needed to host the REST API and the Web front-end, respectively. In order to build this system in its entirety, a compatible version of Java is needed, a compatible version of Python is needed, and a compatible version of Node/NPM is needed. Additionally, the necessary Java build automation tool must be installed (probably Maven or Gradle), and all code library dependencies must be available. That’s a lot of conditions that must be met prior to being able to build the system. What we need is a way to ensure all the conditions necessary for a successful build can be met with minimal effort.

java-rest-api:
    build.gradle
python-cli:
    setup.py
    requirements.txt
js-frontend:
    package.json
tf-api-host:
    terrafile
tf-frontend-host:
    terrafile

Hierarchical Representation of Project

Developer Virtual Workstations

Providing every developer a virtual workstation that’s 100% bootstrapped with all the tools and configuration they need to successfully checkout and build the entire system is one way to both speed up onboarding and guard against a mucked up development environment. Tools like Vagrant make this option both viable and streamlined.

Using this approach, an image can either be packaged/baked and published or configured using a published Vagrantfile. Additionally, developers can personalize their environment through extensions or by modifying their Vagrantfile to suit their preferences. If for some reason a “doesn’t build on my workstation” affliction occurs, a working environment is just a new vagrant up away.

A similar option is the use of VMs in the Cloud. Just be aware of network connectivity and lag. Waiting 5 seconds for an image to load in a browser is acceptable. Waiting 5 seconds for a keystroke to register on the command line is not.

Single Point of Entry Build Automation

The drawback to resolving environment dependencies solely with developer virtual workstations is that the build server and the developer virtual workstations must be kept in sync, which may or may not be a headache. But there are few things more frustrating than having a working build on your developer workstation, only to find out that it doesn’t work on the build server because it’s got a different version of some dependency installed. Additionally, if the development environment is virtual then it may not be as performant as it would be on bare metal.

A single point of entry build automation solution can be used on a bare metal developer workstation or with a developer virtual workstation. To better understand single point of entry build automation, let’s first consider how a build might work assuming any working development environment.

The java-rest-api project, the js-frontend project, the python-cli project, and the Terraform projects must all be built independently. That’s 5 different build entry points for a single system. Then, there is still the question of “Which dependencies are installed on the build server?”

What if your version of NodeJS is different than the build server’s version of NodeJS? What if your version of Terraform is different than the build server’s version of Terraform? What about your version of Maven or Gradle or Python?
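One way to picture the problem is as a comparison between the tool versions a build expects and the versions actually present on a machine. The sketch below is illustrative only: the function name, the pinned versions, and the workstation versions are all hypothetical, and a real check would read them from the project and from the installed tools.

```python
def find_version_mismatches(required, installed):
    """Return {tool: (required, installed)} for every tool whose version differs."""
    mismatches = {}
    for tool, wanted in required.items():
        found = installed.get(tool)  # None means the tool is missing entirely
        if found != wanted:
            mismatches[tool] = (wanted, found)
    return mismatches


# Hypothetical versions the build server uses.
required = {"node": "18.17.0", "terraform": "1.5.7", "python": "3.11.4"}
# Hypothetical versions found on a developer workstation.
installed = {"node": "16.20.2", "terraform": "1.5.7"}

mismatches = find_version_mismatches(required, installed)
for tool, (wanted, found) in sorted(mismatches.items()):
    print(f"{tool}: build expects {wanted}, found {found or 'nothing installed'}")
```

Every entry in the result is a potential “works on the build server, fails for you” failure.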

Ex. Gradle as the Single Point of Entry

Both Maven and Gradle support the concept of multi-project builds. Consider the updated project structure below.

build.gradle
settings.gradle
gradlew
java-rest-api:
    build.gradle
python-cli:
    build.gradle
    setup.py
    requirements.txt
js-frontend:
    build.gradle
    package.json
tf-api-host:
    build.gradle
    terrafile
tf-frontend-host:
    build.gradle
    terrafile

What if we could create a multi-project build like the one above and build the whole thing using Gradle from the parent project? What if we could also ensure that the versions of Gradle, Node, Terraform and Python used were always consistent? With Gradle multi-project builds, a single command gradlew build at the root can build all the sub-projects. What build means can also be adjusted to suit the type of project.

As far as project and language support goes, there is a Gradle Wrapper that ensures all builds use the same version of Gradle. Gradle also has an integration with Node and NPM that will ensure specific versions of Node and NPM are used. Similarly, there is also a Gradle plugin for building Python.

There is not, however, a Gradle plugin for building Terraform. But there is a Maven plugin for Terraform, which also publishes a Java API for Terraform that could be used with Gradle.

By going down this route, there is still an external dependency on Java (Gradle, after all, is a Java-based tool). Gradle can, however, validate that the correct version of Java is available as part of the build. And everything else (what software, which versions, etc.) is dependent upon the build configuration stored with the project, not the developer’s workstation and not the build server. The build configuration is checked into source control with the project. Where the source code goes, it goes.


Few things suck the productivity (and joy) out of software development like brittle build automation. It’s in everyone’s interest to ensure that building the system from top to bottom can be done quickly, easily and with as few external software dependencies as possible. One way to eliminate the burden of external software dependencies and to ensure a complete and functional development environment is to automate it through something like Vagrant.

However, an automated development environment doesn’t eliminate the need for a “batteries included” build. Multi-project build organization and the appropriate use of wrappers and plugins can provide single point of entry build automation that can bootstrap its own software dependencies. Being able to clone a repo consisting of multiple projects, languages and technologies and, with a single command, reliably build them all is both possible and reasonable.

Don’t you think the savings in time and frustration are well worth the little effort required to make it happen?

Updated: March 14, 2021

batteries included: Platform.sh configuration libraries

Platform.sh, like any good PaaS, exposes a lot of useful information to applications via environment variables. The obvious parts, of course, are database credentials, but there’s far more that we make available to allow an application to introspect its environment.

Sometimes those environment variables aren’t as obvious to use as we’d like. Environment variables have their limits, such as only being able to store strings. For that reason, many of the most important environment variables are offered as JSON values, which are then base64-encoded so they fit nicely into environment variables. Those are not always the easiest to read.
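To show what reading one of these values by hand involves, here is a minimal sketch using only the Python standard library. It assumes a variable such as PLATFORM_RELATIONSHIPS holds base64-encoded JSON; the helper name decode_env_json and the sample payload are our own, and a simulated environment is used so the sketch runs anywhere.

```python
import base64
import json
import os


def decode_env_json(name, environ=os.environ):
    """Decode a base64-encoded JSON environment variable, or return None if unset."""
    raw = environ.get(name)
    if raw is None:
        return None
    return json.loads(base64.b64decode(raw).decode("utf-8"))


# Simulated environment; the payload shape here is illustrative only.
sample = {"database": [{"host": "db.internal", "port": 3306}]}
encoded = base64.b64encode(json.dumps(sample).encode("utf-8")).decode("ascii")
fake_environ = {"PLATFORM_RELATIONSHIPS": encoded}

relationships = decode_env_json("PLATFORM_RELATIONSHIPS", fake_environ)
print(relationships["database"][0]["host"])
```

It is only a few lines, but every application ends up repeating them, which is the kind of boilerplate a client library can absorb.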

That’s why we’re happy to announce all new, completely revamped client libraries for PHP, Python, and Node.js to make inspecting the environment as dead-simple as possible.


All of the libraries are available through their respective language package managers:

composer require platformsh/config-reader
pip install platformshconfig
npm install platformsh-config --save


All three libraries work the same way, but are flavored for their own language. All of them start by instantiating a config object. That object then offers methods to introspect the environment in intelligent ways.

For instance, it’s easy to tell if a project is running on Platform.sh, in the build hook or not, or if it’s in a Platform.sh Enterprise environment. In PHP:

    $config = new \Platformsh\ConfigReader\Config();

    $config->inValidPlatform(); // True if env vars are available at all.
    $config->inBuild();
    $config->inRuntime();
    $config->onEnterprise();
    $config->onProduction();

    // Individual Platform.sh environment variables are available as their own properties, too.
    $config->applicationName;
    $config->port;
    $config->project;
    // ...

The onProduction method already takes care of the differences between Platform.sh Professional and Platform.sh Enterprise and will return true in either case.

What about the common case of accessing relationships to get credentials for connecting to a database? Currently, that requires deserializing and introspecting the environment blob yourself. But with the new libraries, it’s reduced to a single method call. In Python:

    import platformshconfig

    config = platformshconfig.Config()
    creds = config.credentials('database')

This will return the access credentials to connect to the database relationship. Any relationship listed in .platform.app.yaml is valid there.

What if you need the credentials formatted a particular way for a third-party library? Fortunately, the new clients are extensible. They support credential formatters, which are simply functions (or callables, or lambdas, or whatever the language of your choice calls them) that take a relationship definition and format it for a particular service library.

For example, one of the most popular Node.js libraries for connecting to Apache Solr, solr-node, wants the name of a collection as its own string. The Platform.sh relationship provides a path, since there are other libraries that use a path to connect. Rather than reformat that string inline, the Node.js library includes a formatter specifically for solr-node:

    const solr = require('solr-node');
    const config = require('platformsh-config').config();

    let client = new solr(config.formattedCredentials('solr-relationship-name', 'solr-node'));

Et voila. client is now a solr-node client and is ready to be used. It’s entirely possible to register your own formatters, too, and third-party libraries can include them as well:

    config.registerFormatter('my-client-library', (creds) => {
        // Do something here to return a string, struct, dictionary, array, or whatever.
    });

We’ve included a few common formatters in each library to cover some common libraries. We’ll be adding more as time goes by, and, of course, PRs are always extremely welcome to add more!
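The formatter idea itself is small enough to sketch in plain Python. The following is not the library's implementation; it is a standalone illustration under our own assumptions, with a hypothetical registry, hypothetical function names, and a made-up relationship dictionary, showing how a named function can turn a relationship definition into whatever shape a client library wants.

```python
# Hypothetical registry mapping a formatter name to a callable.
formatters = {}


def register_formatter(name, func):
    """Store a formatter callable under the given name."""
    formatters[name] = func


def formatted_credentials(relationship, formatter_name):
    """Run the named formatter over a relationship definition."""
    if formatter_name not in formatters:
        raise KeyError(f"No formatter registered as {formatter_name!r}")
    return formatters[formatter_name](relationship)


def postgres_dsn(creds):
    """Format a relationship dictionary as a PostgreSQL-style connection URL."""
    return "postgresql://{username}:{password}@{host}:{port}/{path}".format(**creds)


register_formatter("postgres_dsn", postgres_dsn)

# Made-up relationship definition for illustration.
relationship = {
    "username": "app",
    "password": "secret",
    "host": "db.internal",
    "port": 5432,
    "path": "main",
}
dsn = formatted_credentials(relationship, "postgres_dsn")
print(dsn)
```

Keeping formatters as plain callables means a third-party package can ship one without depending on anything beyond the shape of the relationship data.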

But what about my language?

We wanted to get these three client libraries out the door and into your hands as soon as possible. But don’t worry; Go and Ruby versions are already in the works and will be released soon.

We’ll continue to evolve these new libraries, keeping the API roughly in sync between all languages, but allowing each to feel as natural as possible for each language.

Python Package Managers Explained

Python has become one of the most popular programming languages, thanks to its ease of use and extreme versatility. It has an extensive standard library that comes with “batteries included” making it a powerful tool for all kinds of Python users. From data scientists to network engineers, there’s a Python library for everyone.

What makes Python a true power tool is the ecosystem of free and open-source libraries like TensorFlow, Netmiko, and Flask. These can be installed with a single command using a package manager.

In this article, I’ll explain where all these great packages can be found, how Python’s standard package manager works, and some challenges and solutions to be aware of when using Python.

PyPI: The Package Index

Similar to NuGet.org and npmjs.org, Python also has its own official third-party software repository. The Python Package Index (PyPI) is a repository of software that hosts an extensive collection of Python packages, development frameworks, tools, and libraries.

PyPI packages allow developers to share and reuse code rather than having to reinvent the wheel. As PyPI grew, the need for a package manager became so apparent that Python eventually created its own standard package manager: pip.

Pip: The Standard Package Manager

Pip is built into Python and can install packages from many different sources, but PyPI.org is the primary and default package source.

By default, pip installs packages into the global Python environment, which makes them accessible to all projects. This can be an issue because packages often depend on specific versions of other packages. Since all packages are in a global environment, it’s easy to run into a dependency conflict that may prevent your application from building.

Thankfully, pip automates package management by first resolving all dependencies and then proceeding to install the requested packages. However, the standard method for preventing dependency conflicts is to create separate Python environments for each project.

Virtual Environments and Virtualenv

In the Python world, a virtual environment is a folder containing packages and other dependencies that a Python project needs. The purpose of these environments is to keep projects separate and prevent dependency, version, and permission conflicts.

Imagine a script that relies on version 1.10 of the NumPy package, but a different script requires version 1.20. This is a slight problem because there’s a breaking change in 1.19. If you install everything into a global Python environment (i.e. the default pip behavior), then one of these scripts might not work.
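The conflict can be expressed as a simple check: a single environment can hold only one version of each package, so any package pinned to two different versions is a problem. This is a rough sketch with hypothetical pins and a helper name of our own; real resolvers handle version ranges, not just exact pins.

```python
def find_pin_conflicts(*requirement_sets):
    """Return {package: sorted versions} for packages pinned to more than one version."""
    pins = {}
    for requirements in requirement_sets:
        for package, version in requirements.items():
            pins.setdefault(package, set()).add(version)
    return {
        package: sorted(versions)
        for package, versions in pins.items()
        if len(versions) > 1
    }


# Hypothetical exact pins for two scripts sharing one global environment.
script_a = {"numpy": "1.10", "requests": "2.31"}
script_b = {"numpy": "1.20", "requests": "2.31"}

conflicts = find_pin_conflicts(script_a, script_b)
print(conflicts)
```

Giving each script its own environment makes the conflict disappear, because each environment only has to satisfy one set of pins.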

Virtualenv is a tool that allows the creation of named virtual environments where you can install packages in an isolated manner. Each environment has its own installation directories and doesn’t share libraries with other virtual environments (including globally installed libraries).

For example, one environment for web development and a different environment for data science can be created with their own set of libraries.
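As a minimal sketch of that idea, the example below creates two separate environments in a temporary folder. Note the swap: it uses the standard library's venv module rather than the Virtualenv tool itself, since venv ships with Python and offers similar basic isolation. The helper name and environment names are our own.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(parent_dir, name):
    """Create an isolated environment called `name` under `parent_dir`."""
    env_dir = Path(parent_dir) / name
    # with_pip=False keeps the sketch fast and avoids depending on ensurepip;
    # pass with_pip=True to bootstrap pip into the new environment.
    venv.create(env_dir, with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as workspace:
    web_env = create_environment(workspace, "web-dev")
    data_env = create_environment(workspace, "data-science")
    # Each environment gets its own folder with a pyvenv.cfg marker file.
    created = [(env.name, (env / "pyvenv.cfg").is_file()) for env in (web_env, data_env)]
    for name, ok in created:
        print(name, "created" if ok else "missing")
```

In day-to-day use the same thing is usually done from the command line with python -m venv followed by the folder name, and the environment is then activated before installing packages.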

Pip Alternatives (Pipenv and Poetry)

Pip is the “original” Python package manager that others have attempted to improve upon. Pipenv and Poetry are two package managers that have done this with great success.

Pipenv is a package management tool that “aims to bring the best of all packaging worlds” to Python. Pipenv is similar in spirit to Node.js’s npm and Ruby’s bundler. It’s popular among the Python community because it merges virtual environments and package management into a single tool. While pip is sufficient for personal use, Pipenv is recommended for collaborative projects as it’s a higher-level tool that simplifies dependency management for common use cases and can create virtual environments.

Poetry prides itself on making Python packaging and dependency management “easy”. Besides package management, it can help build distributions for applications and publish them to PyPI. It also lets you declare the libraries a project depends on and installs or updates them while avoiding conflicting package requirements. Furthermore, Poetry keeps development dependencies separate from production dependencies.

Conda: Alternative Package Management

Conda is a multi-purpose package management tool. It manages package dependencies, can create virtual environments for applications, installs compatible Python distributions, and packages applications for deployment to production. It originated from Anaconda, which started as a data science package for Python. Conda installs packages from Anaconda rather than PyPI and can be used with multiple programming languages.

Compared to Pip, the package selection is much smaller, but what Conda lacks in quantity it makes up for in quality. Anyone can publish to PyPI, but only packages curated by Anaconda are published in its repository. While Anaconda requires a paid subscription, it grants access to thousands of curated packages and provides support as well. Conda is an ideal package manager for those that are willing to pay to not worry about the license, quality, and vulnerability issues when dealing with third-party/open-source packages.

Getting Started with Python Package Managers

Pip is the ideal starting place. It comes with Python, is easy to understand, and has an abundance of related resources. However, if you’re working on anything more than a personal project you will likely need to create virtual environments. For that, Pipenv and Poetry are more convenient options than using pip and Virtualenv together.

Alternatively, Conda can be used as a Swiss Army knife package management tool. It has everything you need in one tool and access to packages curated by Anaconda. However, it requires a paid subscription, its repository (Anaconda) has significantly fewer packages, and it has fewer related resources available than PyPI and its package managers.

Managing Python packages is only the tip of the iceberg when it comes to using Python in the enterprise. Read our guide to learn how to master Python in the enterprise!
