My Most Used Python Packages

A list of my most used Python packages.

What kind of tasks do I use Python for?

I use Python for a variety of research tasks, including data analysis, machine learning, and scientific computing. My current research involves developing machine learning models for medical image analysis, which requires a combination of medical image processing, deep learning, and graphical visualization. Python is a versatile language that allows me to work across these different domains and integrate various libraries and tools.

In this post I will list some of the Python packages that I use most frequently in my research, leaving out the most common ones like numpy, pandas, matplotlib, and scikit-learn. Those are essential, but here I want to focus on some of the more specialized packages.

1: SimpleITK

SimpleITK is a simplified layer built on top of the Insight Segmentation and Registration Toolkit (ITK). It is a powerful and easy-to-use library for medical image analysis and visualization. SimpleITK provides a high-level interface to ITK, allowing users to perform complex image processing tasks with minimal code. I use SimpleITK for loading, processing, and visualizing medical image data, as well as for implementing various image segmentation methods.

Here is a code snippet showing how to load and visualize a 3D medical image using SimpleITK:

import SimpleITK as sitk
import matplotlib.pyplot as plt

# Load the image
image_path = 'path/to/image.nii.gz'
image = sitk.ReadImage(image_path)

# Convert the image to a numpy array
image_array = sitk.GetArrayFromImage(image)

# Plot the middle axial slice (the array is indexed as [z, y, x])
plt.imshow(image_array[image_array.shape[0] // 2, :, :], cmap='gray')
plt.show()
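
The snippet above only covers loading and viewing. As an example of the segmentation side mentioned earlier, here is a minimal sketch using Otsu thresholding followed by connected components (the file path is a placeholder, and keeping the largest component is purely for illustration):

import SimpleITK as sitk

# Load a volume (placeholder path)
image = sitk.ReadImage('path/to/image.nii.gz')

# Otsu thresholding gives a quick foreground/background split
mask = sitk.OtsuThreshold(image, 0, 1)

# Label connected regions and keep only the largest one
components = sitk.ConnectedComponent(mask)
largest = sitk.RelabelComponent(components, sortByObjectSize=True) == 1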

2: PyTorch

PyTorch is an open-source machine learning library developed by Facebook’s AI Research lab. It provides a flexible and efficient framework for building and training deep learning models. PyTorch’s dynamic computation graph and automatic differentiation capabilities make it well-suited for research and prototyping. I use PyTorch for developing and training deep learning models for all kinds of tasks, as well as for implementing custom loss functions and optimization algorithms. I used to use TensorFlow, but I have found PyTorch to be easier to use and more flexible.

Here is a code snippet showing how to define and train a simple neural network in PyTorch:

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

# Define the neural network
class SimpleNet(nn.Module):
    def __init__(self):
        super(SimpleNet, self).__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = x.view(-1, 784)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Create the neural network
net = SimpleNet()

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

# Dummy training data standing in for a real dataset (e.g. flattened 28x28 images)
dummy_inputs = torch.randn(1000, 784)
dummy_labels = torch.randint(0, 10, (1000,))
trainloader = DataLoader(TensorDataset(dummy_inputs, dummy_labels), batch_size=32, shuffle=True)

# Train the neural network
for epoch in range(10):
    running_loss = 0.0
    for inputs, labels in trainloader:
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f'Epoch {epoch + 1}, loss: {running_loss / len(trainloader)}')
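
Since I mentioned custom loss functions: in PyTorch a loss can be any nn.Module that returns a scalar tensor. Below is a minimal sketch of a soft Dice loss, a common choice in medical image segmentation (shown as an illustration, not as my exact implementation):

import torch
import torch.nn as nn

class DiceLoss(nn.Module):
    def __init__(self, smooth=1.0):
        super().__init__()
        self.smooth = smooth

    def forward(self, logits, targets):
        # Convert logits to probabilities and flatten each sample
        probs = torch.sigmoid(logits).view(logits.size(0), -1)
        targets = targets.view(targets.size(0), -1)

        # Soft Dice coefficient per sample, averaged over the batch
        intersection = (probs * targets).sum(dim=1)
        dice = (2 * intersection + self.smooth) / (probs.sum(dim=1) + targets.sum(dim=1) + self.smooth)
        return 1 - dice.mean()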

3: VTK

The Visualization Toolkit (VTK) is an open-source software system for 3D computer graphics, image processing, and visualization. It provides a wide range of tools for creating interactive visualizations of scientific data, including medical images. I use VTK for visualizing 3D medical image data, as well as for creating different representations of geometric data (voxels, meshes, point clouds, etc.). Since my work involves a lot of 3D medical image data, meshes, and point clouds, VTK is an essential tool for me. For most purposes, I use vtkPolyData, a data structure that represents a geometric object along with its associated data.

Here is a code snippet showing how to create and visualize a 3D mesh using VTK:

import vtk

# Create a sphere
sphere = vtk.vtkSphereSource()
sphere.SetRadius(1.0)

# Create a mapper
mapper = vtk.vtkPolyDataMapper()
mapper.SetInputConnection(sphere.GetOutputPort())

# Create an actor
actor = vtk.vtkActor()
actor.SetMapper(mapper)

# Create a renderer
renderer = vtk.vtkRenderer()
renderer.AddActor(actor)
renderer.SetBackground(0.1, 0.2, 0.4)

# Create a render window
render_window = vtk.vtkRenderWindow()
render_window.AddRenderer(renderer)

# Create an interactor
interactor = vtk.vtkRenderWindowInteractor()
interactor.SetRenderWindow(render_window)

# Render the scene and start the interaction loop
render_window.Render()
interactor.Start()
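
Since polydata comes up so often in my work, here is a small sketch of how a numpy point cloud with per-point data can be wrapped in a vtkPolyData object (the random points and the 'height' array are made up for illustration):

import numpy as np
import vtk
from vtk.util import numpy_support

# A made-up point cloud of 100 random 3D points
points_np = np.random.rand(100, 3)

# Wrap the numpy array in a vtkPoints object
points = vtk.vtkPoints()
points.SetData(numpy_support.numpy_to_vtk(points_np, deep=True))

# Build the polydata and attach one scalar value per point
polydata = vtk.vtkPolyData()
polydata.SetPoints(points)

scalars = numpy_support.numpy_to_vtk(points_np[:, 2].copy(), deep=True)
scalars.SetName('height')
polydata.GetPointData().SetScalars(scalars)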

4: FEniCS

FEniCS is a popular open-source computing platform for solving partial differential equations (PDEs). It provides a flexible and efficient framework for a wide range of problems, including those arising in fluid dynamics, solid mechanics, and electromagnetics. I use FEniCS for simulating blood flow in patient-specific vascular geometries, as well as for solving various other PDEs in my research. Its high-level interface for defining variational problems makes it easy to implement and test different numerical methods.

Here is a code snippet showing how to solve the Poisson equation using FEniCS:

from fenics import *
import matplotlib.pyplot as plt

# Create a mesh
mesh = UnitSquareMesh(8, 8)

# Define the function space
V = FunctionSpace(mesh, 'P', 1)

# Define the boundary condition
u_D = Expression('1 + x[0]*x[0] + 2*x[1]*x[1]', degree=2)
bc = DirichletBC(V, u_D, 'on_boundary')

# Define the variational problem
u = TrialFunction(V)
v = TestFunction(V)
f = Constant(-6.0)
a = dot(grad(u), grad(v))*dx
L = f*v*dx

# Compute the solution
u = Function(V)
solve(a == L, u, bc)

# Plot the solution (plot() draws with matplotlib, so plt.show() displays it)
plot(u)
plt.show()
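
A nice property of this example is that the boundary expression u_D = 1 + x[0]*x[0] + 2*x[1]*x[1] is also the exact solution of the problem (its Laplacian is 6, matching f = -6), so the accuracy of the computed solution can be checked directly. Continuing the snippet above:

# u_D is also the exact solution, so the numerical error can be computed directly
error_L2 = errornorm(u_D, u, 'L2')
print(f'L2 error: {error_L2}')

# Save the solution in a format that can be opened in ParaView
File('poisson_solution.pvd') << u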

5: OpenCV

OpenCV is an open-source computer vision and machine learning software library. It provides a wide range of tools for image processing, video analysis, and classical computer vision algorithms. I use OpenCV for various image processing tasks, such as image denoising, feature detection, and object tracking.

Here is a code snippet showing how to denoise an image using OpenCV:

import cv2

# Read the image
image = cv2.imread('path/to/image.jpg')

# Convert the image to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Denoise the image
denoised = cv2.fastNlMeansDenoising(gray, None, h=10, templateWindowSize=7, searchWindowSize=21)

# Display the original and denoised images
cv2.imshow('Original', gray)
cv2.imshow('Denoised', denoised)
cv2.waitKey(0)
cv2.destroyAllWindows()
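
As an example of the feature detection mentioned above, here is a small sketch using ORB keypoints (the image path is again a placeholder):

import cv2

# Read the image directly as grayscale (placeholder path)
image = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)

# Detect ORB keypoints and compute their descriptors
orb = cv2.ORB_create()
keypoints, descriptors = orb.detectAndCompute(image, None)

# Draw the keypoints on the image and display the result
annotated = cv2.drawKeypoints(image, keypoints, None, color=(0, 255, 0))
cv2.imshow('ORB keypoints', annotated)
cv2.waitKey(0)
cv2.destroyAllWindows()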

6: VMTK

The Vascular Modeling Toolkit (VMTK) is an open-source software system for 3D geometric modeling of blood vessels. It provides a wide range of tools for processing and analyzing medical image data, as well as for creating patient-specific vascular models. I use VMTK for extracting and processing blood vessel geometries from medical image data, as well as for creating different representations of vascular structures.

Here is a code snippet showing how to extract the centerline of a blood vessel using VMTK:

from vmtk import vmtkscripts

# Load the mesh
reader = vmtkscripts.vmtkSurfaceReader()
reader.InputFileName = 'path/to/mesh.vtp'
reader.Execute()

# Extract the centerline (by default this opens an interactive window
# to pick the source and target seed points on the surface)
centerline = vmtkscripts.vmtkCenterlines()
centerline.Surface = reader.Surface
centerline.Execute()

# Save the centerline
writer = vmtkscripts.vmtkSurfaceWriter()
writer.Surface = centerline.Centerlines
writer.OutputFileName = 'path/to/centerline.vtp'
writer.Execute()

7: Multiprocessing

Multiprocessing is a built-in Python module that provides support for concurrent execution using processes. I use the multiprocessing module to parallelize computationally intensive tasks, such as image processing, data analysis, and model training. By distributing the workload across multiple processes, I can take advantage of multi-core processors and reduce the overall execution time of my research tasks. It is super easy to use and can be a huge time saver.

Here is a code snippet showing how to use the multiprocessing module to parallelize a simple task:

import multiprocessing

# Define the task
def process_image(image_path):
    # Load and process the image
    ...

if __name__ == '__main__':
    # Create the pool of worker processes (one per CPU core by default);
    # the __main__ guard is needed because worker processes may re-import this module
    with multiprocessing.Pool() as pool:
        # Process a list of images in parallel
        image_paths = ['image1.jpg', 'image2.jpg', 'image3.jpg']
        pool.map(process_image, image_paths)
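
A small variation worth knowing: imap_unordered yields results as the workers finish them, rather than waiting for the whole batch, which is handy for progress reporting on long-running jobs. A sketch, reusing the same (placeholder) process_image function:

import multiprocessing

def process_image(image_path):
    # Load and process the image
    ...

if __name__ == '__main__':
    image_paths = ['image1.jpg', 'image2.jpg', 'image3.jpg']
    with multiprocessing.Pool(processes=4) as pool:
        # Results arrive in completion order, not input order
        for result in pool.imap_unordered(process_image, image_paths):
            print('Finished one image')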