Python Engineering (4): Type Hints, Linting, and Code Quality
Add type safety with mypy, enforce style with ruff and black, and automate checks with pre-commit hooks. Make code reviews about logic, not formatting.
Code reviews should be about logic and design, not about whether someone used single quotes or double quotes. Formatting debates are a waste of engineering time. The solution is to let machines handle style and let humans focus on correctness.
This article covers three layers of automated code quality: type hints catch logical errors before runtime, linters catch style violations and common bugs, and pre-commit hooks enforce everything automatically on every commit.
Python is dynamically typed, but since 3.5 it supports optional type annotations. They do not affect runtime behavior. They are metadata that tools like mypy can check.
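A minimal sketch of what "do not affect runtime behavior" means in practice. The function name `double` is made up for illustration; the point is that the interpreter stores annotations but never enforces them.

```python
def double(x: int) -> int:
    # The annotations say int, but nothing checks that at runtime.
    return x * 2


print(double(4))     # 8
print(double("ab"))  # abab: runs fine, although mypy would report an error

# Annotations are stored as plain metadata on the function object.
print(double.__annotations__)
```

Only a static checker such as mypy would flag the second call; the interpreter happily repeats the string.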
    from typing import Optional, Union

    # Optional means "this type or None"
    def find_user(user_id: int) -> Optional[dict]:
        """Returns user dict or None if not found."""
        ...

    # Union means "one of these types"
    def process(value: Union[str, int]) -> str:
        return str(value)

    # Python 3.10+ shorthand
    def find_user(user_id: int) -> dict | None:
        ...

    def process(value: str | int) -> str:
        return str(value)

    from typing import Any, Callable, Iterator

    # Any disables type checking for this value
    def log(message: Any) -> None:
        print(message)

    # Callable[[arg_types], return_type]
    def retry(func: Callable[[str], bool], attempts: int = 3) -> bool:
        for _ in range(attempts):
            if func("test"):
                return True
        return False

    # Iterator and Generator
    def count_up(start: int, end: int) -> Iterator[int]:
        current = start
        while current < end:
            yield current
            current += 1
When you need a function that works with any type but preserves the relationship:
    from typing import TypeVar, Sequence

    T = TypeVar("T")

    def first(items: Sequence[T]) -> T:
        """Return the first item. Type of return matches type of items."""
        return items[0]

    # mypy knows these types:
    x: int = first([1, 2, 3])        # T = int
    y: str = first(["a", "b", "c"])  # T = str

    from typing import TypeVar

    # Numeric is constrained: it must resolve to exactly int or float
    Numeric = TypeVar("Numeric", int, float)

    def add(a: Numeric, b: Numeric) -> Numeric:
        return a + b

    add(1, 2)      # OK: int
    add(1.0, 2.0)  # OK: float
    add("a", "b")  # Error: str is not int or float
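The constrained TypeVar above accepts only the listed types. A related form, `bound=`, accepts any subtype of a single upper bound and preserves the concrete type. The sketch below uses hypothetical names (`S`, `longest`) and assumes `collections.abc.Sized` as the bound.

```python
from collections.abc import Sized
from typing import TypeVar

# S can be any type that supports len(); the return type is the
# same concrete type that was passed in.
S = TypeVar("S", bound=Sized)


def longest(a: S, b: S) -> S:
    """Return whichever argument has the greater length."""
    return a if len(a) >= len(b) else b


print(longest([1, 2, 3], [4]))  # [1, 2, 3]
print(longest("hi", "hello"))   # hello
```

A rough rule of thumb: use constraints when only a few specific types make sense, and a bound when any type with the right capability should be accepted.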
Protocol defines an interface by structure, not inheritance. If it has the right methods, it matches:
    from typing import Protocol, runtime_checkable

    @runtime_checkable
    class Readable(Protocol):
        def read(self, size: int = -1) -> bytes: ...

    def process_stream(source: Readable) -> bytes:
        """Accepts anything with a .read() method."""
        return source.read()

    # This works with any object that has .read(), without inheriting Readable
    import io
    data = process_stream(io.BytesIO(b"hello"))  # OK

    from typing import TypedDict

    class UserRecord(TypedDict):
        name: str
        age: int
        email: str

    # total=False: all fields are optional
    class UserRecordPartial(TypedDict, total=False):
        name: str
        age: int
        email: str

    def create_user(data: UserRecord) -> int:
        # mypy knows data["name"] is str, data["age"] is int
        ...

    # OK
    create_user({"name": "Alice", "age": 30, "email": "a@b.com"})

    # Error: missing "email"
    create_user({"name": "Alice", "age": 30})

    # Error: Incompatible return value type (got "Optional[str]", expected "str")
    def get_name(user_id: int) -> str:
        result = lookup(user_id)  # returns Optional[str]
        return result  # Error!

    # Fix: handle the None case
    def get_name(user_id: int) -> str:
        result = lookup(user_id)
        if result is None:
            raise ValueError(f"User {user_id} not found")
        return result  # Now mypy knows result is str

    # Error: Item "None" of "Optional[dict]" has no attribute "get"
    def get_email(user: dict | None) -> str:
        return user.get("email", "")  # Error: user might be None

    # Fix: narrow the type
    def get_email(user: dict | None) -> str:
        if user is None:
            return ""
        return user.get("email", "")

    # Error: Need type annotation for "items"
    items = []  # mypy doesn't know the element type

    # Fix: annotate
    items: list[str] = []

    # Error: Argument 1 to "open" has incompatible type "Optional[str]"
    def read_file(path: str | None) -> str:
        with open(path) as f:  # Error: path might be None
            return f.read()

    # Fix: check first or change the type
    def read_file(path: str) -> str:
        with open(path) as f:
            return f.read()

    from typing import ParamSpec, TypeVar, Callable
    from functools import wraps
    import time

    P = ParamSpec("P")
    R = TypeVar("R")

    def timing(func: Callable[P, R]) -> Callable[P, R]:
        """Decorator that logs execution time without losing type info."""
        @wraps(func)
        def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
            start = time.perf_counter()
            result = func(*args, **kwargs)
            elapsed = time.perf_counter() - start
            print(f"{func.__name__} took {elapsed:.3f}s")
            return result
        return wrapper

    @timing
    def fetch_user(user_id: int, include_posts: bool = False) -> dict:
        ...

    # mypy knows: fetch_user(user_id=42, include_posts=True) -> dict
    # mypy catches: fetch_user("wrong")  # Error: str is not int
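Because `Readable` is decorated with `@runtime_checkable`, it can also be used with `isinstance()`. The sketch below illustrates this with `FakeSocket`, a hypothetical class invented for the example. Note that the runtime check only looks for a `read` attribute; it does not verify the signature or return type.

```python
import io
from typing import Protocol, runtime_checkable


@runtime_checkable
class Readable(Protocol):
    def read(self, size: int = -1) -> bytes: ...


class FakeSocket:
    """Hypothetical class: no inheritance from Readable, just a matching method."""

    def read(self, size: int = -1) -> bytes:
        return b"ping"


print(isinstance(io.BytesIO(b"hello"), Readable))  # True
print(isinstance(FakeSocket(), Readable))          # True
print(isinstance(42, Readable))                    # False: int has no .read
```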
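Without ParamSpec, a decorator usually has to be annotated with Callable[..., R], which erases the parameter types: mypy then accepts any arguments to the decorated function, as if its signature were (*args: Any, **kwargs: Any).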
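For contrast, here is a sketch of a decorator typed without ParamSpec. The names `untyped_args` and `square` are hypothetical. It behaves identically at runtime; the difference is only in what a type checker can see.

```python
from functools import wraps
from typing import Any, Callable, TypeVar

R = TypeVar("R")


def untyped_args(func: Callable[..., R]) -> Callable[..., R]:
    """Keeps the return type but erases the parameter types."""

    @wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> R:
        return func(*args, **kwargs)

    return wrapper


@untyped_args
def square(n: int) -> int:
    return n * n


print(square(4))  # 16
# A type checker sees square as (...) -> int, so a call such as
# square("oops") would pass type checking and only fail when it runs.
```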
TypeVarTuple (Python 3.11+) types functions that accept a variable number of typed arguments:
    from typing import TypeVarTuple, Unpack

    Ts = TypeVarTuple("Ts")

    def first_of(*args: Unpack[Ts]) -> tuple[Unpack[Ts]]:
        return args

    # Type checker knows:
    result = first_of(1, "hello", 3.14)  # result: tuple[int, str, float]
A practical use case is a typed middleware chain:
    from typing import TypeVarTuple, Generic, Unpack, Callable

    Ts = TypeVarTuple("Ts")

    class Pipeline(Generic[Unpack[Ts]]):
        """Type-safe pipeline where each stage's output feeds the next."""
        def __init__(self, *stages: Unpack[Ts]):
            self.stages = stages

    from typing import TypeGuard, TypeIs

    def is_string_list(val: list[object]) -> TypeGuard[list[str]]:
        """After this returns True, mypy knows val is list[str]."""
        return all(isinstance(x, str) for x in val)

    def process(data: list[object]):
        if is_string_list(data):
            # mypy knows: data is list[str] here
            print(", ".join(data))  # No error
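TypeIs (Python 3.13+) is stricter than TypeGuard: it narrows in both the if and else branches, and the narrowed type must be compatible with the declared input type.
Use @override (Python 3.12+) to explicitly mark methods that override a parent class method. mypy will report an error if the parent has no matching method, for example after the parent method is renamed or removed: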
    from typing import override

    class Animal:
        def speak(self) -> str:
            return "..."

    class Dog(Animal):
        @override
        def speak(self) -> str:
            return "Woof"

        @override
        def eat(self) -> None:  # Error: Animal has no method 'eat'
            ...
    from fastapi import FastAPI
    from pydantic import BaseModel, Field

    app = FastAPI()

    class ItemCreate(BaseModel):
        name: str
        price: float = Field(gt=0)
        tags: list[str] = []

    class ItemResponse(BaseModel):
        id: int
        name: str
        price: float
        tags: list[str]

    @app.post("/items", response_model=ItemResponse)
    async def create_item(item: ItemCreate) -> ItemResponse:
        # item is already validated by Pydantic
        # db is assumed to be an async data-access object defined elsewhere
        saved = await db.save(item.model_dump())
        return ItemResponse(id=saved.id, **item.model_dump())
Guideline: Use Pydantic at system boundaries (API inputs, config files, external data). Use dataclasses for internal value objects where you trust the data.
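The internal half of that guideline might look like the sketch below. `Money` is a hypothetical value object, not something from the example application; it assumes its inputs were already validated at the boundary, so it performs no type checking of its own fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    """Hypothetical internal value object; field types are trusted, not validated."""

    amount_cents: int
    currency: str = "USD"

    def add(self, other: "Money") -> "Money":
        # Business rule, not input validation: currencies must match.
        if other.currency != self.currency:
            raise ValueError("currency mismatch")
        return Money(self.amount_cents + other.amount_cents, self.currency)


total = Money(1250).add(Money(399))
print(total)  # Money(amount_cents=1649, currency='USD')
```

A frozen dataclass gives equality, a readable repr, and immutability without the validation overhead that is only needed where untrusted data enters the system.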
ruff is a Python linter written in Rust. It is 10-100x faster than flake8 and replaces flake8, isort, pyflakes, pycodestyle, pydocstyle, and many flake8 plugins in a single tool.
    # pyproject.toml
    [tool.ruff]
    target-version = "py311"
    line-length = 88

    [tool.ruff.lint]
    select = [
        "E",    # pycodestyle errors
        "W",    # pycodestyle warnings
        "F",    # pyflakes
        "I",    # isort
        "N",    # pep8-naming
        "UP",   # pyupgrade
        "B",    # flake8-bugbear
        "SIM",  # flake8-simplify
        "C4",   # flake8-comprehensions
        "DTZ",  # flake8-datetimez
        "T20",  # flake8-print (no print in prod code)
        "RET",  # flake8-return
        "PTH",  # flake8-use-pathlib
        "ERA",  # eradicate (commented-out code)
        "RUF",  # ruff-specific rules
    ]
    ignore = [
        "E501",  # line too long (handled by formatter)
    ]

    [tool.ruff.lint.per-file-ignores]
    "tests/*" = ["T20", "S101"]  # allow print and assert in tests

    [tool.ruff.lint.isort]
    known-first-party = ["my_tool"]

    # F401: imported but unused
    import os  # <-- ruff removes this

    # F841: local variable assigned but never used
    def process():
        result = compute()  # <-- ruff flags this
        return None

    # B006: mutable default argument
    def append_to(item, target=[]):  # <-- Bug! Shared mutable default
        target.append(item)
        return target

    # SIM108: use ternary instead of if-else
    if condition:
        x = 1
    else:
        x = 2
    # ruff suggests: x = 1 if condition else 2

    # UP007 (UP045 in newer ruff releases): use PEP 604 union syntax
    from typing import Optional  # <-- ruff suggests: str | None
    def f(x: Optional[str]): ...

    # C4: use dict/list comprehension
    dict([(k, v) for k, v in items])  # <-- ruff suggests: {k: v for k, v in items}
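The B006 case is the one most likely to cause a real bug, so it is worth running once. The sketch below shows the shared default in action and the usual `None`-sentinel fix; the `_buggy` suffix is added here only to keep both versions side by side.

```python
def append_to_buggy(item, target=[]):
    # The default list is created once, when the function is defined.
    target.append(item)
    return target


def append_to(item, target=None):
    # A fresh list is created on every call that omits target.
    if target is None:
        target = []
    target.append(item)
    return target


first = append_to_buggy(1)
second = append_to_buggy(2)
print(second)           # [1, 2]: the earlier item leaked into this call
print(first is second)  # True: both calls returned the same list object

print(append_to(1))     # [1]
print(append_to(2))     # [2]
```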
black is an opinionated code formatter. It makes style decisions for you so you never argue about formatting again.
    (.venv) $ pip install black

    # Check without modifying
    (.venv) $ black --check src/
    would reformat src/my_tool/core.py
    Oh no!
    1 file would be reformatted.

    # Show what would change
    (.venv) $ black --diff src/my_tool/core.py

    # Format in place
    (.venv) $ black src/
    reformatted src/my_tool/core.py
    All done!
    1 file reformatted.
Note: ruff format is now a drop-in replacement for black, so you can skip installing black separately and just use ruff format.
    (.venv) $ pre-commit install
    pre-commit installed at .git/hooks/pre-commit

    # Run against all files (first time)
    (.venv) $ pre-commit run --all-files
    ruff.....................................................................Passed
    ruff-format..............................................................Passed
    mypy.....................................................................Passed
    trailing whitespace......................................................Passed
    fix end of files.........................................................Passed
    check yaml...............................................................Passed
    check toml...............................................................Passed
    check for added large files..............................................Passed
    debug statements.........................................................Passed
Now every git commit runs these checks. If a hook such as ruff or ruff-format modifies a file, the commit fails; run git add on the modified file and commit again.
Your code is now type-safe, consistently formatted, and automatically checked on every commit. But Python programs do more than compute; they read files, parse configs, and serialize data in a dozen formats. In the next article, we will master I/O, tackle encoding headaches, and compare every serialization format from JSON to Parquet.