Index ¦ Archives ¦ RSS > Tag: python

APSW 3.37 is the last with Python 2 / early Python 3 support

This release of APSW (my Python wrapper for SQLite's C interface) is the last that will support Python 2, and earlier versions of Python 3 (before 3.7).

If you currently use APSW with Python 2/early Python 3, then you will want to pin the APSW version to 3.37. You will still be able to use this version of APSW with future versions of SQLite (supported till 2050). But new C level APIs won't be covered. The last C level API additions were 3.36 in June 2021 adding serialization and 3.37 in December 2021 adding autovacuum control

What does APSW support now ...

APSW supports every Python version 2.3 onwards (released 2003). It doesn't support earlier versions as there was no GIL API (needed for multi-threading support).

The downloads for the prebuilt Windows binary gives an idea of just how many Python versions that is (15). (Python 3.0 does actually work, but is missing a module used by the test suite.)

Many Python versions supported ...

Each release does involve building and testing all the combinations of 15 Python versions, 32 and 64 bit environment, and both UCS2 and UCS4 Unicode size for Python < 3.3, on multiple operating systems.

There are ~13k lines of C code making up APSW, with ~7k lines of Python code making up the test suite. It is that test suite that gives the confidence that all is working as intended.

... and why?

I wanted to make sure that APSW is the kind of module I would want to use. The most frustrating thing as a developer is that you want to change one thing (eg one library) and then find that forces you to change the versions of other components, or worse the runtime and dev tools (eg compiler).

I never made the guarantee, but it turned out to be:

You can change the APSW (and SQLite) versions, and nothing else. No other changes will be required and everything will continue to work, probably better.

This would apply to any project otherwise untouched since 2004!

There are two simple reasons:

  • Because I could - I do software development for a living, and not breaking things is a good idea (usually)
  • I would have to delete code that works

What happens next?

I am going to delete code that works, but it is mainly in blocks saying doing one thing for Python 2, another for early Python 3, and another for current Python 3.

My plan is to incrementally remove Python 2/early 3 code from the Python test suite and the C code base together, while updating documentation (only Python 3 types need to be mentioned). The test suite and coverage testing will hopefully catch any problems early.

I will be happy that the code base, testing, documentation, and tooling will all become smaller. That makes things less complex.

Other thoughts

The hardest part of porting APSW from Python 2 to 3 was the test suite which had to remain valid to both environments. For example it is easy to create invalid Unicode strings in Python 2 which I had to make sure the test suite checked.

It was about 10 times the amount of work making the Python test suite code changes, vs the C level API work. Python 3 wasn't that much different in terms of the C API (just some renaming and unification of int and long etc).

Category: misc – Tags: apsw, python


Exit Review: Python 2 (and some related thoughts)

Python 2 has come to an end. I ported the last of my personal scripts to Python 3 a few months ago.

Perhaps the greatest feature of Python 2 was that after the first few releases, it stayed stable. Code ran and worked. New releases didn't break anything. It was predictable. And existing Python 2 code won't break for a long time.

The end of Python 2 has led to the end of that stability, which isn't a bad thing. Python 3 is now competing across a broader ecosystem of languages and environments trying to improve developer and runtime efficiency. Great!

I did see a quote that Python is generally the second best solution to any problem. That is a good summary, and shows why Python is so useful when you need to solve many different problems. It is also my review of Python 2.

So let's have some musings ...

Python has had poor timing. The first Python release (1994) was when unicode was being developed, so the second major Python version (2000) had to bolt on unicode support. But if it had waited a few more years, then things could have been simpler by going straight to utf8 (see also PEP 0538).

Every language has been adding async with Python 3 (2008) increasing support with each minor release. However like most other languages, functions ended up coloured. This will end up solved, almost certainly by having the runtime automagically doing the right thing.

Python 3 made a big mistake with the 2to3 tool. It works exactly as described. But it had the unfortunate effect of maintainers keeping their code in Python 2, and using that to make releases that supported both Python 2 and 3. The counter-example is javascript where tools provide the most recent syntax and transpiling to support older versions. Hopefully future Python migration tools will follow the same pattern so that code can be maintained in the most recent release, and transpiled to support older versions. This should also be the case for using the C API.

The CPython C API is quite nice for a C based object API. Even the internal objects use it. It followed the standard pattern of the time with an object (structure) pointer and methods taking it as a parameter. There are also macros for "optimised access". But this style makes changing underlying implementation details difficult, as alternate Python interpeter implementations have found out. If for example a handle based API was used instead, then it would have been slower due to an indirection, but allow easier changing of implementation details.

Another mistake was not namespacing the third party package repository PyPI. Others have made the same mistake. For example when SourceForge was a thing, they did not use namespacing so the urls were sf.net/projectname - which then led to issues over who legitimately owned projectname. Github added namespaces so the urls are github.com/user/projectname. (user can also be an organization.) This means the same projectname can exist many times over. That makes forking really easy, and is perhaps one of the most important software freedoms.

Using NPM as an example, this is the only package that can be named database. It hasn't been updated in 6 years. On PyPI this is apsw and hasn't been updated in 5 years. (I am the apsw author updating it about quarterly but not the publisher on PyPI for reasons.) Go does use namespacing. A single namespace prevents forks (under the same name) and also makes name squatting very easy. Hopefully Python will figure out a nice solution.

Category: misc – Tags: exit review, python

Contact me