diff mbox series

[27/27] mkvenv.py: experiment; use distlib to generate script entry points

Message ID 20230511035435.734312-28-jsnow@redhat.com (mailing list archive)
State New, archived
Headers show
Series configure: create a python venv and ensure meson, sphinx | expand

Commit Message

John Snow May 11, 2023, 3:54 a.m. UTC
This is an experiment: by using pip's internal vendored distlib, we can
generate script entry points for Windows, Mac and Linux using distlib's
mechanisms. This is the same mechanism used when running "pip install",
so it will be consistent with the most common method of package
installation on all platforms. It also allows us to delete a good bit of
vendored/borrowed code from inside of mkvenv.py, so there's less to
maintain and the license might be more straightforward.

As a downside, if we're not willing to actually add a distlib
requirement, we have to piggy-back on pip's vendored version, which
could have downsides if they move our cheese in the future.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 configure                |  16 +-----
 python/scripts/mkvenv.py | 117 +++++++++++++--------------------------
 python/setup.cfg         |   7 ++-
 3 files changed, 46 insertions(+), 94 deletions(-)

Comments

Paolo Bonzini May 11, 2023, 6:57 a.m. UTC | #1
On 5/11/23 05:54, John Snow wrote:
> This is an experiment: by using pip's internal vendored distlib, we can
> generate script entry points for Windows, Mac and Linux using distlib's
> mechanisms. This is the same mechanism used when running "pip install",
> so it will be consistent with the most common method of package
> installation on all platforms. It also allows us to delete a good bit of
> vendored/borrowed code from inside of mkvenv.py, so there's less to
> maintain and the license might be more straightforward.
> 
> As a downside, if we're not willing to actually add a distlib
> requirement, we have to piggy-back on pip's vendored version, which
> could have downsides if they move our cheese in the future.

I think the downsides are limited since you're trying both the "real" 
distlib and pip's.  Overall I like this, though I haven't made up my 
mind if it should be split and squashed earlier in the series.

Paolo
Paolo Bonzini May 11, 2023, 7:02 a.m. UTC | #2
On 5/11/23 05:54, John Snow wrote:
> +    if checkpip():
> +        # We ran ensurepip. We need to re-run post_init...!
> +        args = [sys.executable, __file__, "post_init"]
> +        subprocess.run(args, check=True)
> +        return
> +
>       # Finally, generate a 'pip' script so the venv is usable in a normal
>       # way from the CLI. This only happens when we inherited pip from a
>       # parent/system-site and haven't run ensurepip in some way.

Can't this just be:

+    if not checkpip():
+        # Finally, generate a 'pip' script so the venv is usable in a normal
+        # way from the CLI. This only happens when we inherited pip from a
+        # parent/system-site and haven't run ensurepip in some way.
+        generate_console_scripts(["pip"])

even squashed in the original Debian 10 patch?

You don't need to generate the pip shims if ensurepip has been run.

Paolo
John Snow May 11, 2023, 3:58 p.m. UTC | #3
On Thu, May 11, 2023, 3:02 AM Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 5/11/23 05:54, John Snow wrote:
> > +    if checkpip():
> > +        # We ran ensurepip. We need to re-run post_init...!
> > +        args = [sys.executable, __file__, "post_init"]
> > +        subprocess.run(args, check=True)
> > +        return
> > +
> >       # Finally, generate a 'pip' script so the venv is usable in a
> normal
> >       # way from the CLI. This only happens when we inherited pip from a
> >       # parent/system-site and haven't run ensurepip in some way.
>
> Can't this just be:
>
> +    if not checkpip():
> +        # Finally, generate a 'pip' script so the venv is usable in a
> normal
> +        # way from the CLI. This only happens when we inherited pip from a
> +        # parent/system-site and haven't run ensurepip in some way.
> +        generate_console_scripts(["pip"])
>
> even squashed in the original

Debian 10 patch?
>
> You don't need to generate the pip shims if ensurepip has been run.
>
> Paolo
>

*cough*

You’re completely right. I was so preoccupied on diagnosing why it was
failing I didn't even recognize that it wasn't even needed...

*facepalm*

I'll make that simplifying change, which will also allow me to just put the
import in the global scope instead of trying to do it JIT to work around
ensurepip shenanigans. Should be a few less "I know this is bad" comments
for the linters, too.

--js

>
Paolo Bonzini May 11, 2023, 4:14 p.m. UTC | #4
Il gio 11 mag 2023, 17:58 John Snow <jsnow@redhat.com> ha scritto:

> I'll make that simplifying change, which will also allow me to just put
>> the import in the global scope instead of trying to do it JIT to work
>> around ensurepip shenanigans. Should be a few less "I know this is bad"
>> comments for the linters, too.
>
>
I don't think you can do that because, until you are running in the new
venv, you aren't guaranteed to have either distlib or pip. Once in the venv
you'll get the latter via ensurepip, if it wasn't already present in the
system site-packages.

Paolo

--js
>
>>
John Snow May 11, 2023, 4:16 p.m. UTC | #5
On Thu, May 11, 2023, 12:14 PM Paolo Bonzini <pbonzini@redhat.com> wrote:

>
>
> Il gio 11 mag 2023, 17:58 John Snow <jsnow@redhat.com> ha scritto:
>
>> I'll make that simplifying change, which will also allow me to just put
>>> the import in the global scope instead of trying to do it JIT to work
>>> around ensurepip shenanigans. Should be a few less "I know this is bad"
>>> comments for the linters, too.
>>
>>
> I don't think you can do that because, until you are running in the new
> venv, you aren't guaranteed to have either distlib or pip. Once in the venv
> you'll get the latter via ensurepip, if it wasn't already present in the
> system site-packages.
>
> Paolo
>

Yeah, not without a *little* trickery. It still needs a try/except but it
can be moved up.

Or I could leave well enough alone and worry about cleaning up imports when
we move to 3.8
diff mbox series

Patch

diff --git a/configure b/configure
index ff654a9f45..b559bdc040 100755
--- a/configure
+++ b/configure
@@ -1147,21 +1147,7 @@  fi
 # We ignore PATH completely here: we want to use the venv's Meson
 # *exclusively*.
 
-# "mkvenv ensure" has a limitation compared to "pip install": it is not
-# able to create launcher .exe files on Windows.  This limitation exists
-# because "py.exe" is not guaranteed to exist on the machine (pip/setuptools
-# work around the issue by bundling the .exe files as resources).
-# This is not a problem for msys, since it emulates a POSIX environment;
-# it is also okay for programs that meson.build looks up with find_program(),
-# because in that case Meson checks the file for a shebang line.  However,
-# when meson wants to invoke itself as part of a build recipe, we need
-# to convince it to put the python interpreter in front of the path to the
-# script.  To do so, run it using '-m'.
-if test "$targetos" = windows; then
-  meson="$python -m mesonbuild.mesonmain"
-else
-  meson="$(cd pyvenv/bin; pwd)/meson"
-fi
+meson="$(cd pyvenv/bin; pwd)/meson"
 
 # Conditionally ensure Sphinx is installed.
 
diff --git a/python/scripts/mkvenv.py b/python/scripts/mkvenv.py
index 8e097e4759..8faec0957b 100644
--- a/python/scripts/mkvenv.py
+++ b/python/scripts/mkvenv.py
@@ -63,14 +63,12 @@ 
 import re
 import shutil
 import site
-import stat
 import subprocess
 import sys
 import sysconfig
 from types import SimpleNamespace
 from typing import (
     Any,
-    Dict,
     Iterator,
     Optional,
     Sequence,
@@ -376,7 +374,7 @@  def _stringify(data: Union[str, bytes]) -> str:
     print(builder.get_value("env_exe"))
 
 
-def _gen_importlib(packages: Sequence[str]) -> Iterator[Dict[str, str]]:
+def _gen_importlib(packages: Sequence[str]) -> Iterator[str]:
     # pylint: disable=import-outside-toplevel
     # pylint: disable=no-name-in-module
     # pylint: disable=import-error
@@ -394,14 +392,7 @@  def _gen_importlib(packages: Sequence[str]) -> Iterator[Dict[str, str]]:
             distribution,
         )
 
-    # Borrowed from CPython (Lib/importlib/metadata/__init__.py)
-    pattern = re.compile(
-        r"(?P<module>[\w.]+)\s*"
-        r"(:\s*(?P<attr>[\w.]+)\s*)?"
-        r"((?P<extras>\[.*\])\s*)?$"
-    )
-
-    def _generator() -> Iterator[Dict[str, str]]:
+    def _generator() -> Iterator[str]:
         for package in packages:
             try:
                 entry_points = distribution(package).entry_points
@@ -415,34 +406,17 @@  def _generator() -> Iterator[Dict[str, str]]:
             )
 
             for entry_point in entry_points:
-                # Python 3.8 doesn't have 'module' or 'attr' attributes
-                if not (
-                    hasattr(entry_point, "module")
-                    and hasattr(entry_point, "attr")
-                ):
-                    match = pattern.match(entry_point.value)
-                    assert match is not None
-                    module = match.group("module")
-                    attr = match.group("attr")
-                else:
-                    module = entry_point.module
-                    attr = entry_point.attr
-                yield {
-                    "name": entry_point.name,
-                    "module": module,
-                    "import_name": attr,
-                    "func": attr,
-                }
+                yield f"{entry_point.name} = {entry_point.value}"
 
     return _generator()
 
 
-def _gen_pkg_resources(packages: Sequence[str]) -> Iterator[Dict[str, str]]:
+def _gen_pkg_resources(packages: Sequence[str]) -> Iterator[str]:
     # pylint: disable=import-outside-toplevel
     # Bundled with setuptools; has a good chance of being available.
     import pkg_resources
 
-    def _generator() -> Iterator[Dict[str, str]]:
+    def _generator() -> Iterator[str]:
         for package in packages:
             try:
                 eps = pkg_resources.get_entry_map(package, "console_scripts")
@@ -450,28 +424,11 @@  def _generator() -> Iterator[Dict[str, str]]:
                 continue
 
             for entry_point in eps.values():
-                yield {
-                    "name": entry_point.name,
-                    "module": entry_point.module_name,
-                    "import_name": ".".join(entry_point.attrs),
-                    "func": ".".join(entry_point.attrs),
-                }
+                yield str(entry_point)
 
     return _generator()
 
 
-# Borrowed/adapted from pip's vendored version of distlib:
-SCRIPT_TEMPLATE = r"""#!{python_path:s}
-# -*- coding: utf-8 -*-
-import re
-import sys
-from {module:s} import {import_name:s}
-if __name__ == '__main__':
-    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
-    sys.exit({func:s}())
-"""
-
-
 def generate_console_scripts(
     packages: Sequence[str],
     python_path: Optional[str] = None,
@@ -496,7 +453,7 @@  def generate_console_scripts(
     if not packages:
         return
 
-    def _get_entry_points() -> Iterator[Dict[str, str]]:
+    def _get_entry_points() -> Iterator[str]:
         """Python 3.7 compatibility shim for iterating entry points."""
         # Python 3.8+, or Python 3.7 with importlib_metadata installed.
         try:
@@ -515,34 +472,32 @@  def _get_entry_points() -> Iterator[Dict[str, str]]:
                 "Use Python 3.8+, or install importlib-metadata or setuptools."
             ) from exc
 
+    # Try to load distlib, with a fallback to pip's vendored version.
+    # We perform the loading here, just-in-time, so it may occur after
+    # a potential call to ensurepip in checkpip().
+    # pylint: disable=import-outside-toplevel
+    try:
+        from distlib import scripts
+    except ImportError:
+        try:
+            # Reach into pip's cookie jar:
+            from pip._vendor.distlib import scripts  # type: ignore
+        except ImportError as exc:
+            logger.exception("failed to locate distlib")
+            raise Ouch(
+                "distlib not found, can't generate script shims."
+            ) from exc
+
+    maker = scripts.ScriptMaker(None, bin_path)
+    maker.variants = {""}
+    maker.clobber = False
+
     for entry_point in _get_entry_points():
-        script_path = os.path.join(bin_path, entry_point["name"])
-        script = SCRIPT_TEMPLATE.format(python_path=python_path, **entry_point)
+        for filename in maker.make(entry_point):
+            logger.debug("wrote console_script '%s'", filename)
 
-        # If the script already exists (in any form), do not overwrite
-        # it nor recreate it in a new format.
-        suffixes = ("", ".exe", "-script.py", "-script.pyw")
-        if any(os.path.exists(f"{script_path}{s}") for s in suffixes):
-            continue
 
-        # FIXME: this is only correct for POSIX systems.  On Windows, the
-        # script source should be written to foo-script.py, and the py.exe
-        # launcher copied to foo.exe.  Unfortunately there is no guarantee that
-        # py.exe exists on the machine.  Creating the script like this is
-        # enough for msys and meson, both of which understand shebang lines.
-        # It does requires some care when invoking meson however, which is
-        # worked around in configure.  Note that a .exe launcher is needed
-        # and not for example a batch file, because the CreateProcess API
-        # (used by Ninja) cannot start them.
-        with open(script_path, "w", encoding="UTF-8") as file:
-            file.write(script)
-        mode = os.stat(script_path).st_mode | stat.S_IEXEC
-        os.chmod(script_path, mode)
-
-        logger.debug("wrote '%s'", script_path)
-
-
-def checkpip() -> None:
+def checkpip() -> bool:
     """
     Debian10 has a pip that's broken when used inside of a virtual environment.
 
@@ -553,12 +508,12 @@  def checkpip() -> None:
         import pip._internal  # type: ignore  # noqa: F401
 
         logger.debug("pip appears to be working correctly.")
-        return
+        return False
     except ModuleNotFoundError as exc:
         if exc.name == "pip._internal":
             # Uh, fair enough. They did say "internal".
             # Let's just assume it's fine.
-            return
+            return False
         logger.warning("pip appears to be malfunctioning: %s", str(exc))
 
     check_ensurepip("pip appears to be non-functional, and ")
@@ -570,6 +525,7 @@  def checkpip() -> None:
         check=True,
     )
     logger.debug("Pip is now (hopefully) repaired!")
+    return True
 
 
 def pkgname_from_depspec(dep_spec: str) -> str:
@@ -787,7 +743,12 @@  def post_venv_setup() -> None:
     """
     logger.debug("post_venv_setup()")
     # Test for a broken pip (Debian 10 or derivative?) and fix it if needed
-    checkpip()
+    if checkpip():
+        # We ran ensurepip. We need to re-run post_init...!
+        args = [sys.executable, __file__, "post_init"]
+        subprocess.run(args, check=True)
+        return
+
     # Finally, generate a 'pip' script so the venv is usable in a normal
     # way from the CLI. This only happens when we inherited pip from a
     # parent/system-site and haven't run ensurepip in some way.
diff --git a/python/setup.cfg b/python/setup.cfg
index 7d75c168a8..4ea016876b 100644
--- a/python/setup.cfg
+++ b/python/setup.cfg
@@ -111,6 +111,12 @@  ignore_missing_imports = True
 [mypy-pkg_resources]
 ignore_missing_imports = True
 
+[mypy-distlib]
+ignore_missing_imports = True
+
+[mypy-pip._vendor.distlib]
+ignore_missing_imports = True
+
 [pylint.messages control]
 # Disable the message, report, category or checker with the given id(s). You
 # can either give multiple identifiers separated by comma (,) or put this
@@ -123,7 +129,6 @@  ignore_missing_imports = True
 # --disable=W".
 disable=consider-using-f-string,
         consider-using-with,
-        fixme,
         too-many-arguments,
         too-many-function-args,  # mypy handles this with less false positives.
         too-many-instance-attributes,