Re: [PATCH v4] gen_compile_commands.py: fix path resolve with symlinks in it

From: Masahiro Yamada
Date: Sun Dec 10 2023 - 01:05:17 EST


On Wed, Dec 6, 2023 at 10:26 AM Jialu Xu <xujialu@xxxxxxxxx> wrote:
>
> When a path contains relative symbolic links, os.path.abspath() might
> not follow the symlinks and instead return the absolute path with just
> the relative paths resolved, resulting in an incorrect path.
>
> 1. Say "drivers/hdf/" has some symlinks:
>
> # ls -l drivers/hdf/
> total 364
> drwxrwxr-x 2 ... 4096 ... evdev
> lrwxrwxrwx 1 ... 44 ... framework -> ../../../../../../drivers/hdf_core/framework
> -rw-rw-r-- 1 ... 359010 ... hdf_macro_test.h
> lrwxrwxrwx 1 ... 55 ... inner_api -> ../../../../../../drivers/hdf_core/interfaces/inner_api
> lrwxrwxrwx 1 ... 53 ... khdf -> ../../../../../../drivers/hdf_core/adapter/khdf/linux
> -rw-r--r-- 1 ... 74 ... Makefile
> drwxrwxr-x 3 ... 4096 ... wifi
>
> 2. One .cmd file records that:
>
> # head -1 ./framework/core/manager/src/.devmgr_service.o.cmd
> cmd_drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.o := ... \
> /path/to/out/drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.c
>
> 3. os.path.abspath returns "/path/to/out/framework/core/manager/src/devmgr_service.c", not correct:
>
> # ./scripts/clang-tools/gen_compile_commands.py
> INFO: Could not add line from ./framework/core/manager/src/.devmgr_service.o.cmd: File \
> /path/to/out/framework/core/manager/src/devmgr_service.c not found
>
> Use pathlib.Path.resolve(), which resolves the symlinks and normalizes
> the paths correctly.
>
> # cat compile_commands.json
> ...
> {
> "command": ...
> "directory": ...
> "file": "/path/to/blabla/drivers/hdf_core/framework/core/manager/src/devmgr_service.c"
> },
> ...
>
> Reviewed-by: Nathan Chancellor <nathan@xxxxxxxxxx>
> Signed-off-by: Jialu Xu <xujialu@xxxxxxxxx>
> ---
> scripts/clang-tools/gen_compile_commands.py | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/scripts/clang-tools/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
> index 180952fb91c1b..99e28b7152c19 100755
> --- a/scripts/clang-tools/gen_compile_commands.py
> +++ b/scripts/clang-tools/gen_compile_commands.py
> @@ -11,6 +11,7 @@ import argparse
> import json
> import logging
> import os
> +from pathlib import Path
> import re
> import subprocess
> import sys
> @@ -172,8 +173,9 @@ def process_line(root_directory, command_prefix, file_path):
> # by Make, so this code replaces the escaped version with '#'.
> prefix = command_prefix.replace('\#', '#').replace('$(pound)', '#')
>
> - # Use os.path.abspath() to normalize the path resolving '.' and '..' .
> - abs_path = os.path.abspath(os.path.join(root_directory, file_path))
> + # Make the path absolute, resolving all symlinks on the way and also normalizing it.
> + # Convert Path object to a string because 'PosixPath' is not JSON serializable.
> + abs_path = str(Path(root_directory, file_path).resolve())
> if not os.path.exists(abs_path):
> raise ValueError('File %s not found' % abs_path)
> return {





Is there any reason why you didn't simply replace
os.path.abspath() with os.path.realpath() ?



This patch uses pathlib.Path() just in one place,
leaving many call-sites of os.path.*() functions.

If it is just a matter of your preference,
you need to convert os.path.*() for consistency
(as a follow-up patch).





I see one more os.path.abspath()



return (args.log_level,
os.path.abspath(args.directory),
args.output,
args.ar,
args.paths if len(args.paths) > 0 else [args.directory])




Does it cause a similar issue for the 'directory' field
with symbolic link jungles?







--
Best Regards
Masahiro Yamada