python

package
v1.1.14
Published: Nov 26, 2025 License: MIT Imports: 8 Imported by: 0

README

Python Code Analyzer

A Python code analyzer that extracts symbols, structure, and relationships from Python files. It indexes the code for semantic search in Qdrant.

Status: ✅ FULLY IMPLEMENTED


🎯 What Does This Analyzer Do?

The Python analyzer parses .py files and extracts:

  1. Symbols - classes, methods, functions, variables, constants
  2. Relationships - inheritance, dependencies, method calls
  3. Metadata - decorators, type hints, docstrings

The extracted information is converted into CodeChunks, which are then indexed in Qdrant for semantic search.


📊 Data Flow

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  .py files      │────▶│  Python Analyzer │────▶│   CodeChunks    │
│  (source code)  │     │  (regex parsing) │     │   (structured)  │
└─────────────────┘     └──────────────────┘     └────────┬────────┘
                                                          │
                                                          ▼
                                                 ┌─────────────────┐
                                                 │     Qdrant      │
                                                 │  (vector store) │
                                                 └─────────────────┘

🔍 What We Index

1. Classes (type: "class")

@dataclass
class User(BaseModel, LoggingMixin, metaclass=ABCMeta):
    """Represents a user in the system."""
    name: str
    email: str

Extracted information:

| Field | Value | Description |
| --- | --- | --- |
| name | "User" | Class name |
| bases | ["BaseModel", "LoggingMixin"] | Parent classes (inheritance) |
| decorators | ["dataclass"] | Applied decorators |
| is_abstract | true | Whether the class is abstract (ABC) |
| is_dataclass | true | Whether it is decorated with @dataclass |
| is_enum | false | Whether it inherits from Enum |
| is_protocol | false | Whether it is a Protocol (typing) |
| is_mixin | true | Whether it is, or uses, a mixin |
| metaclass | "ABCMeta" | The specified metaclass |
| dependencies | ["BaseModel", "LoggingMixin"] | All dependencies of the class |
| docstring | "Represents a user..." | Class documentation |
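
The README does not show how the class header is parsed, so the following is only a minimal sketch of how a regex-based pass might recover name, bases, metaclass, and the mixin flag from a single-line header. The names classHeader, classRe, and parseClassHeader are hypothetical and not part of this package; splitting on commas is an assumption that breaks on bases such as Generic[K, V].

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// classHeader is a hypothetical, trimmed-down stand-in for the package's ClassInfo.
type classHeader struct {
	Name      string
	Bases     []string
	Metaclass string
	IsMixin   bool
}

// classRe matches "class Name:" or "class Name(args):" on a single line.
var classRe = regexp.MustCompile(`^\s*class\s+([A-Za-z_]\w*)\s*(?:\(([^)]*)\))?\s*:`)

// parseClassHeader extracts the name, base classes, and metaclass from one line.
func parseClassHeader(line string) (classHeader, bool) {
	m := classRe.FindStringSubmatch(line)
	if m == nil {
		return classHeader{}, false
	}
	h := classHeader{Name: m[1]}
	for _, arg := range strings.Split(m[2], ",") {
		arg = strings.TrimSpace(arg)
		switch {
		case arg == "":
			continue
		case strings.HasPrefix(arg, "metaclass="):
			h.Metaclass = strings.TrimPrefix(arg, "metaclass=")
		case strings.Contains(arg, "="):
			continue // other keyword arguments are ignored in this sketch
		default:
			h.Bases = append(h.Bases, arg)
			if strings.HasSuffix(arg, "Mixin") {
				h.IsMixin = true // assumed heuristic: a base whose name ends in "Mixin"
			}
		}
	}
	return h, true
}

func main() {
	h, ok := parseClassHeader("class User(BaseModel, LoggingMixin, metaclass=ABCMeta):")
	fmt.Println(ok, h.Name, h.Bases, h.Metaclass, h.IsMixin)
	// true User [BaseModel LoggingMixin] ABCMeta true
}
```

Decorators such as @dataclass sit on the preceding lines, so a real pass would have to carry them forward to the header it finds next.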

2. Methods (type: "method")

class UserService:
    async def get_user(self, user_id: int) -> User:
        """Returns a user by ID."""
        self.validate_id(user_id)
        user = await self.repository.find(user_id)
        return user

Extracted information:

| Field | Value | Description |
| --- | --- | --- |
| name | "get_user" | Method name |
| signature | "async def get_user(self, user_id: int) -> User" | Full signature |
| class_name | "UserService" | Enclosing class |
| parameters | [{name: "user_id", type: "int"}] | Parameters with their types |
| return_type | "User" | Return type |
| is_async | true | Whether the method is async |
| is_static | false | Whether it is a @staticmethod |
| is_classmethod | false | Whether it is a @classmethod |
| calls | [{name: "validate_id", receiver: "self"}, ...] | Methods called |
| type_deps | ["User"] | Types used (dependencies) |
| docstring | "Returns a user..." | Method documentation |
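
As an illustration of the fields in this table, here is a rough sketch of pulling name, async flag, parameters, and return type out of a single-line def header with a regular expression. The types param and signature and the function parseSignature are hypothetical; param merely stands in for codetypes.ParamInfo, whose real fields are not shown in this README. Splitting parameters on commas is an assumption that fails for annotations like Dict[str, int].

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// param is a hypothetical stand-in for codetypes.ParamInfo.
type param struct {
	Name string
	Type string
}

// signature holds the pieces pulled out of a single-line def header.
type signature struct {
	Name       string
	IsAsync    bool
	Params     []param
	ReturnType string
}

// defRe captures: optional "async", the name, the raw parameter list,
// and an optional return annotation.
var defRe = regexp.MustCompile(`^\s*(async\s+)?def\s+([A-Za-z_]\w*)\s*\(([^)]*)\)\s*(?:->\s*([^:]+))?\s*:`)

func parseSignature(line string) (signature, bool) {
	m := defRe.FindStringSubmatch(line)
	if m == nil {
		return signature{}, false
	}
	sig := signature{Name: m[2], IsAsync: m[1] != "", ReturnType: strings.TrimSpace(m[4])}
	for _, raw := range strings.Split(m[3], ",") {
		raw = strings.TrimSpace(raw)
		if raw == "" || raw == "self" || raw == "cls" {
			continue // the receiver is not reported as a parameter
		}
		// Drop a default value, then split "name: type".
		if i := strings.Index(raw, "="); i >= 0 {
			raw = strings.TrimSpace(raw[:i])
		}
		name, typ, _ := strings.Cut(raw, ":")
		sig.Params = append(sig.Params, param{Name: strings.TrimSpace(name), Type: strings.TrimSpace(typ)})
	}
	return sig, true
}

func main() {
	sig, ok := parseSignature("    async def get_user(self, user_id: int) -> User:")
	fmt.Printf("%v %+v\n", ok, sig)
}
```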

3. Functions (type: "function")

@lru_cache(maxsize=100)
async def fetch_data(url: str) -> dict:
    """Downloads data from a URL."""
    yield process(url)  # plain yield; `yield from` is a SyntaxError in async def

Extracted information:

| Field | Value | Description |
| --- | --- | --- |
| name | "fetch_data" | Function name |
| signature | "async def fetch_data(url: str) -> dict" | Signature |
| is_async | true | Whether the function is async |
| is_generator | true | Whether it uses yield |
| decorators | ["lru_cache"] | Applied decorators |
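
The decorators and is_generator fields could plausibly be derived as follows. This is a sketch under assumptions, not the package's code: decoratorName and isGenerator are hypothetical helpers, and comment stripping is naive (a '#' inside a string literal would also cut the line).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// decoratorRe captures the dotted decorator name and ignores any call arguments.
var decoratorRe = regexp.MustCompile(`^\s*@([A-Za-z_][\w.]*)`)

// yieldRe matches "yield" as a whole word.
var yieldRe = regexp.MustCompile(`\byield\b`)

// decoratorName returns "lru_cache" for "@lru_cache(maxsize=100)".
func decoratorName(line string) (string, bool) {
	m := decoratorRe.FindStringSubmatch(line)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// isGenerator reports whether any body line uses yield outside a trailing comment.
func isGenerator(body []string) bool {
	for _, line := range body {
		code, _, _ := strings.Cut(line, "#") // naive: also cuts '#' inside strings
		if yieldRe.MatchString(code) {
			return true
		}
	}
	return false
}

func main() {
	name, _ := decoratorName("@lru_cache(maxsize=100)")
	fmt.Println(name, isGenerator([]string{"    yield process(url)"}))
}
```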

4. Properties (type: "property")

class User:
    @property
    def full_name(self) -> str:
        return f"{self.first_name} {self.last_name}"

    @full_name.setter
    def full_name(self, value: str):
        self.first_name, self.last_name = value.split()

Extracted information:

| Field | Value | Description |
| --- | --- | --- |
| name | "full_name" | Property name |
| type | "str" | Return type |
| has_getter | true | Has a getter (@property) |
| has_setter | true | Has a setter (@x.setter) |
| has_deleter | false | Has a deleter (@x.deleter) |
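
One way to arrive at a single property entry from several decorated defs is to merge accessors by name. The sketch below assumes that approach; propertyFlags and collectProperties are hypothetical names, and the code only looks at decorators written alone on their own line.

```go
package main

import (
	"fmt"
	"regexp"
)

// propertyFlags mirrors the three accessor booleans from the table above.
type propertyFlags struct {
	HasGetter, HasSetter, HasDeleter bool
}

// accessorRe matches "@property" or "@<name>.setter" / "@<name>.deleter".
var accessorRe = regexp.MustCompile(`^\s*@(?:(property)|([A-Za-z_]\w*)\.(setter|deleter))\s*$`)

// defNameRe captures the name of a def on the current line.
var defNameRe = regexp.MustCompile(`^\s*def\s+([A-Za-z_]\w*)`)

// collectProperties scans class body lines and merges accessors by property name.
func collectProperties(lines []string) map[string]*propertyFlags {
	props := map[string]*propertyFlags{}
	get := func(name string) *propertyFlags {
		if props[name] == nil {
			props[name] = &propertyFlags{}
		}
		return props[name]
	}
	pendingGetter := false
	for _, line := range lines {
		if m := accessorRe.FindStringSubmatch(line); m != nil {
			switch {
			case m[1] != "":
				pendingGetter = true // the name comes from the following def
			case m[3] == "setter":
				get(m[2]).HasSetter = true
			case m[3] == "deleter":
				get(m[2]).HasDeleter = true
			}
			continue
		}
		if m := defNameRe.FindStringSubmatch(line); m != nil {
			if pendingGetter {
				get(m[1]).HasGetter = true
			}
			pendingGetter = false
		}
	}
	return props
}

func main() {
	props := collectProperties([]string{
		"    @property",
		"    def full_name(self) -> str:",
		"    @full_name.setter",
		"    def full_name(self, value: str):",
	})
	fmt.Printf("%+v\n", *props["full_name"])
}
```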

5. Constants (type: "const")

MAX_CONNECTIONS: int = 100
API_BASE_URL = "https://api.example.com"

Extracted information:

  • Detected via the UPPER_CASE naming convention
  • The type and value are extracted

6. Variables (type: "var")

logger = logging.getLogger(__name__)
default_config: Config = Config()
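
The const/var split described above comes down to a naming check on module-level assignments. A minimal sketch, assuming unindented single-line assignments; assignment, assignRe, upperRe, and classify are hypothetical names rather than this package's API.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// assignRe matches an unindented "NAME = value" or "NAME: Type = value".
// The value may not start with '=', which rules out "x == 1".
var assignRe = regexp.MustCompile(`^([A-Za-z_]\w*)\s*(?::\s*([^=]+?))?\s*=\s*([^=].*)$`)

// upperRe encodes the UPPER_CASE convention (leading underscores allowed).
var upperRe = regexp.MustCompile(`^_*[A-Z][A-Z0-9_]*$`)

type assignment struct {
	Kind  string // "const" or "var"
	Name  string
	Type  string
	Value string
}

// classify parses one line and labels it "const" or "var" by its name.
func classify(line string) (assignment, bool) {
	m := assignRe.FindStringSubmatch(line)
	if m == nil {
		return assignment{}, false
	}
	a := assignment{
		Kind:  "var",
		Name:  m[1],
		Type:  strings.TrimSpace(m[2]),
		Value: strings.TrimSpace(m[3]),
	}
	if upperRe.MatchString(a.Name) {
		a.Kind = "const"
	}
	return a, true
}

func main() {
	for _, line := range []string{
		"MAX_CONNECTIONS: int = 100",
		"logger = logging.getLogger(__name__)",
	} {
		a, _ := classify(line)
		fmt.Printf("%s %s type=%q value=%q\n", a.Kind, a.Name, a.Type, a.Value)
	}
}
```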

🔗 Relationship Detection

Dependency Graph

The analyzer builds a dependency graph between classes:

class OrderService:
    repository: OrderRepository  # → dependency

    def create_order(self, user: User) -> Order:  # → dependencies: User, Order
        notification = NotificationService()  # → dependency (from calls)
        return Order(...)

Detected dependencies:

  • OrderRepository - from the type hint on the attribute
  • User - from the parameter
  • Order - from the return type
  • NotificationService - from method calls
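
To make the type-hint part of this concrete, here is a sketch of reducing annotation strings to a de-duplicated dependency list. It is an assumption about the general technique, not the package's implementation: typeDeps and the ignored set are hypothetical, the ignore list is deliberately short, and a dotted annotation such as models.User would yield both "models" and "User".

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// identRe picks out every identifier inside an annotation string.
var identRe = regexp.MustCompile(`[A-Za-z_]\w*`)

// ignored lists builtin and typing names that are not project dependencies
// (a short, assumed list for illustration).
var ignored = map[string]bool{
	"int": true, "str": true, "float": true, "bool": true, "bytes": true, "None": true,
	"list": true, "dict": true, "set": true, "tuple": true,
	"List": true, "Dict": true, "Set": true, "Tuple": true,
	"Optional": true, "Union": true, "Any": true, "Callable": true,
}

// typeDeps returns the sorted, de-duplicated names referenced by the annotations.
func typeDeps(annotations ...string) []string {
	seen := map[string]bool{}
	for _, ann := range annotations {
		for _, name := range identRe.FindAllString(ann, -1) {
			if !ignored[name] {
				seen[name] = true
			}
		}
	}
	deps := make([]string, 0, len(seen))
	for name := range seen {
		deps = append(deps, name)
	}
	sort.Strings(deps)
	return deps
}

func main() {
	// Annotations from the OrderService example: attribute, parameter, return type.
	fmt.Println(typeDeps("OrderRepository", "User", "Order"))
	// [Order OrderRepository User]
}
```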
Method Call Analysis
def process(self, data):
    self.validate(data)           # → self.validate
    result = Helper.compute(data) # → Helper.compute (static call)
    super().process(data)         # → super().process
    save_to_db(result)            # → save_to_db (function call)

Detected calls:

{
  "calls": [
    {"name": "validate", "receiver": "self", "line": 2},
    {"name": "compute", "receiver": "Helper", "class_name": "Helper", "line": 3},
    {"name": "process", "receiver": "super()", "line": 4},
    {"name": "save_to_db", "line": 5}
  ]
}
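
A regex-based extractor could produce records of this shape roughly as sketched below. The names call, callRe, and extractCalls are hypothetical (call stands in for MethodCall), line numbers are counted from the first line passed in, and calls inside string literals would be picked up as false positives.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// call is a hypothetical stand-in for the package's MethodCall.
type call struct {
	Name     string
	Receiver string
	Line     int
}

// callRe matches an optional receiver followed by a name and "(".
// The receiver is a dotted path or the literal "super()".
var callRe = regexp.MustCompile(`(?:(super\(\)|[A-Za-z_][\w.]*)\.)?([A-Za-z_]\w*)\s*\(`)

// defLineRe recognizes definition headers, which are not calls.
var defLineRe = regexp.MustCompile(`^\s*(?:async\s+)?(?:def|class)\s`)

// keywords can be followed by "(" without being calls, e.g. "return (x)".
var keywords = map[string]bool{
	"if": true, "elif": true, "while": true, "for": true, "return": true,
	"and": true, "or": true, "not": true, "in": true, "assert": true,
}

// extractCalls returns every call found, with 1-based line numbers.
func extractCalls(lines []string) []call {
	var calls []call
	for i, line := range lines {
		if defLineRe.MatchString(line) {
			continue
		}
		code, _, _ := strings.Cut(line, "#") // naive comment stripping
		for _, m := range callRe.FindAllStringSubmatch(code, -1) {
			if m[1] == "" && keywords[m[2]] {
				continue
			}
			calls = append(calls, call{Name: m[2], Receiver: m[1], Line: i + 1})
		}
	}
	return calls
}

func main() {
	for _, c := range extractCalls([]string{
		"def process(self, data):",
		"    self.validate(data)",
		"    result = Helper.compute(data)",
		"    super().process(data)",
		"    save_to_db(result)",
	}) {
		fmt.Printf("%+v\n", c)
	}
}
```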

🏗️ File Structure

python/
├── types.go           # Types: ModuleInfo, ClassInfo, MethodInfo, MethodCall, etc.
├── analyzer.go        # PathAnalyzer implementation (1500+ lines)
├── api_analyzer.go    # Legacy APIAnalyzer (build-tagged out)
├── analyzer_test.go   # 26 tests
└── README.md          # This documentation

💻 Usage

Standard Analysis

import (
	"fmt"
	"log"

	// internal/ package: importable only from within the rag-code-mcp module
	"github.com/doITmagic/rag-code-mcp/internal/ragcode/analyzers/python"
)

// Create the analyzer (test files are excluded by default)
analyzer := python.NewCodeAnalyzer()

// Analyze directories and/or files
chunks, err := analyzer.AnalyzePaths([]string{"./myproject"})
if err != nil {
	log.Fatal(err)
}

for _, chunk := range chunks {
    fmt.Printf("[%s] %s.%s\n", chunk.Type, chunk.Package, chunk.Name)
    fmt.Printf("  Dependencies: %v\n", chunk.Metadata["dependencies"])
}

With Options

// Also include test files
analyzer := python.NewCodeAnalyzerWithOptions(true)
🔌 Integration

Language Manager

The Python analyzer is selected automatically for:

  • python, py - generic Python projects
  • django - Django projects
  • flask - Flask projects
  • fastapi - FastAPI projects

Workspace Detection

Python projects are detected by the presence of:

| File | Description |
| --- | --- |
| pyproject.toml | PEP 518 - modern Python |
| setup.py | Legacy setuptools |
| requirements.txt | pip dependencies |
| Pipfile | Pipenv |

📋 CodeChunk Types

| Type | Description | Example |
| --- | --- | --- |
| class | Class definition | class User(BaseModel): |
| method | Class method | def get_user(self): |
| function | Module-level function | def helper(): |
| property | @property property | @property def name(self): |
| const | UPPER_CASE constant | MAX_SIZE = 100 |
| var | Module-level variable | logger = getLogger() |

🏷️ Complete Metadata

Class Metadata
{
  "bases": ["BaseModel", "Mixin"],
  "decorators": ["dataclass"],
  "is_abstract": false,
  "is_dataclass": true,
  "is_enum": false,
  "is_protocol": false,
  "is_mixin": false,
  "metaclass": "",
  "dependencies": ["BaseModel", "Mixin", "User", "Order"]
}
Method Metadata
{
  "class_name": "UserService",
  "is_static": false,
  "is_classmethod": false,
  "is_async": true,
  "is_abstract": false,
  "decorators": ["cache"],
  "calls": [
    {"name": "validate", "receiver": "self", "line": 10},
    {"name": "save", "receiver": "self.repository", "line": 12}
  ],
  "type_deps": ["User", "Order"]
}
Function Metadata
{
  "is_async": true,
  "is_generator": false,
  "decorators": ["lru_cache"]
}

🧪 Testing

# Run all tests (26 tests)
go test ./internal/ragcode/analyzers/python/

# With verbose output
go test -v ./internal/ragcode/analyzers/python/

# A specific test
go test -v -run TestMethodCallExtraction ./internal/ragcode/analyzers/python/

# With coverage
go test -cover ./internal/ragcode/analyzers/python/

🚫 Excluded Paths

The analyzer automatically skips:

  • __pycache__/ - Python cache
  • .venv/, venv/, env/ - virtual environments
  • .git/ - Git
  • .tox/, .pytest_cache/, .mypy_cache/ - caches
  • dist/, build/ - distributions
  • test_*.py, *_test.py - test files (by default)
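
The exclusion rules above can be expressed as a small path predicate. This is a sketch of the rules as listed, with hypothetical names (skippedDirs, shouldAnalyze); how the analyzer actually walks directories is not documented here.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// skippedDirs holds the directory names from the list above.
var skippedDirs = map[string]bool{
	"__pycache__": true, ".venv": true, "venv": true, "env": true, ".git": true,
	".tox": true, ".pytest_cache": true, ".mypy_cache": true, "dist": true, "build": true,
}

// shouldAnalyze reports whether a relative path would be analyzed.
// includeTests plays the role of the NewCodeAnalyzerWithOptions argument.
func shouldAnalyze(path string, includeTests bool) bool {
	parts := strings.Split(filepath.ToSlash(path), "/")
	file := parts[len(parts)-1]
	for _, dir := range parts[:len(parts)-1] {
		if skippedDirs[dir] {
			return false
		}
	}
	if !strings.HasSuffix(file, ".py") {
		return false
	}
	isTest := strings.HasPrefix(file, "test_") || strings.HasSuffix(file, "_test.py")
	return includeTests || !isTest
}

func main() {
	for _, p := range []string{"app/models.py", "app/__pycache__/models.py", "tests/test_models.py"} {
		fmt.Println(p, shouldAnalyze(p, false))
	}
}
```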

⚠️ Limitations

| Limitation | Description |
| --- | --- |
| Regex-based | Does not use a full Python AST, so it can miss edge cases |
| No Type Resolution | Type hints are extracted as strings, not resolved |
| Single-file | Each file is analyzed independently |
| No Runtime Info | The code is never executed; analysis is static only |

🔮 Future Improvements

  • Django: models, views, URLs, forms
  • Flask/FastAPI: route detection, dependency injection
  • Type resolution: cross-file resolution of type hints
  • Import graph: complete import graph
  • Nested classes: classes defined inside other classes
  • Comprehensions: list/dict/set comprehensions

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type ClassInfo

type ClassInfo struct {
	Name         string         `json:"name"`
	Description  string         `json:"description"` // Class docstring
	Bases        []string       `json:"bases,omitempty"`
	Decorators   []string       `json:"decorators,omitempty"`
	Methods      []MethodInfo   `json:"methods"`
	Properties   []PropertyInfo `json:"properties"`
	ClassVars    []VariableInfo `json:"class_vars,omitempty"`
	IsAbstract   bool           `json:"is_abstract"`
	IsDataclass  bool           `json:"is_dataclass"`
	IsEnum       bool           `json:"is_enum"`                // Inherits from Enum
	IsProtocol   bool           `json:"is_protocol"`            // Inherits from Protocol (typing)
	IsMixin      bool           `json:"is_mixin"`               // Class name ends with Mixin or used as mixin
	Metaclass    string         `json:"metaclass,omitempty"`    // metaclass= argument
	Dependencies []string       `json:"dependencies,omitempty"` // Classes this class depends on (via type hints, imports)
	FilePath     string         `json:"file_path,omitempty"`
	StartLine    int            `json:"start_line,omitempty"`
	EndLine      int            `json:"end_line,omitempty"`
	Code         string         `json:"code,omitempty"`
}

ClassInfo describes a Python class

type CodeAnalyzer

type CodeAnalyzer struct {
	// contains filtered or unexported fields
}

CodeAnalyzer implements PathAnalyzer for Python

func NewCodeAnalyzer

func NewCodeAnalyzer() *CodeAnalyzer

NewCodeAnalyzer creates a new Python code analyzer

func NewCodeAnalyzerWithOptions

func NewCodeAnalyzerWithOptions(includeTests bool) *CodeAnalyzer

NewCodeAnalyzerWithOptions creates a Python code analyzer with options

func (*CodeAnalyzer) AnalyzeFile

func (ca *CodeAnalyzer) AnalyzeFile(filePath string) ([]codetypes.CodeChunk, error)

AnalyzeFile analyzes a single Python file

func (*CodeAnalyzer) AnalyzePaths

func (ca *CodeAnalyzer) AnalyzePaths(paths []string) ([]codetypes.CodeChunk, error)

AnalyzePaths implements the PathAnalyzer interface

func (*CodeAnalyzer) GetModules

func (ca *CodeAnalyzer) GetModules() []*ModuleInfo

GetModules returns the internal module information

type ConstantInfo

type ConstantInfo struct {
	Name        string `json:"name"`
	Type        string `json:"type,omitempty"`
	Value       string `json:"value"`
	Description string `json:"description"`
	FilePath    string `json:"file_path,omitempty"`
	StartLine   int    `json:"start_line,omitempty"`
	EndLine     int    `json:"end_line,omitempty"`
}

ConstantInfo describes a module-level constant (UPPER_CASE)

type DependencyInfo

type DependencyInfo struct {
	Source     string   `json:"source"`     // Source class/module
	Target     string   `json:"target"`     // Target class/module
	Type       string   `json:"type"`       // "inheritance", "composition", "import", "type_hint"
	References []string `json:"references"` // Specific references (method names, etc.)
}

DependencyInfo represents a dependency relationship between classes/modules

type DocstringArg

type DocstringArg struct {
	Name        string `json:"name"`
	Type        string `json:"type,omitempty"`
	Description string `json:"description"`
	Default     string `json:"default,omitempty"`
	Optional    bool   `json:"optional,omitempty"`
}

DocstringArg represents a parameter/attribute in docstring

type DocstringInfo

type DocstringInfo struct {
	Summary     string           `json:"summary"`
	Description string           `json:"description"`
	Args        []DocstringArg   `json:"args,omitempty"`
	Returns     *DocstringReturn `json:"returns,omitempty"`
	Raises      []DocstringRaise `json:"raises,omitempty"`
	Examples    []string         `json:"examples,omitempty"`
	Attributes  []DocstringArg   `json:"attributes,omitempty"`
}

DocstringInfo contains parsed docstring information

type DocstringRaise

type DocstringRaise struct {
	Type        string `json:"type"`
	Description string `json:"description"`
}

DocstringRaise represents an exception that can be raised

type DocstringReturn

type DocstringReturn struct {
	Type        string `json:"type,omitempty"`
	Description string `json:"description"`
}

DocstringReturn represents return value documentation

type FunctionInfo

type FunctionInfo struct {
	Name        string                 `json:"name"`
	Signature   string                 `json:"signature"`
	Description string                 `json:"description"` // Function docstring
	Parameters  []codetypes.ParamInfo  `json:"parameters"`
	ReturnType  string                 `json:"return_type,omitempty"`
	Returns     []codetypes.ReturnInfo `json:"returns,omitempty"`
	Decorators  []string               `json:"decorators,omitempty"`
	IsAsync     bool                   `json:"is_async"`
	IsGenerator bool                   `json:"is_generator"`
	FilePath    string                 `json:"file_path,omitempty"`
	StartLine   int                    `json:"start_line,omitempty"`
	EndLine     int                    `json:"end_line,omitempty"`
	Code        string                 `json:"code,omitempty"`
}

FunctionInfo describes a module-level function

type ImportInfo

type ImportInfo struct {
	Module    string   `json:"module"`          // Module being imported
	Names     []string `json:"names,omitempty"` // Specific names imported (from X import a, b)
	Alias     string   `json:"alias,omitempty"` // Import alias (import X as Y)
	IsFrom    bool     `json:"is_from"`         // True if "from X import Y"
	StartLine int      `json:"start_line,omitempty"`
}

ImportInfo describes an import statement
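
For orientation, the sketch below shows how the two common import forms might map onto these fields. It is illustrative only: importInfo copies the field names above without StartLine, parseImport is a hypothetical helper, "import a, b" and multi-line imports are not handled, and "x as y" inside a from-import is kept verbatim in Names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// importInfo copies the ImportInfo fields shown above (StartLine omitted).
type importInfo struct {
	Module string
	Names  []string
	Alias  string
	IsFrom bool
}

var (
	// fromRe matches "from X import a, b" on one line.
	fromRe = regexp.MustCompile(`^\s*from\s+([\w.]+)\s+import\s+(.+)$`)
	// importRe matches "import X" with an optional "as Y".
	importRe = regexp.MustCompile(`^\s*import\s+([\w.]+)(?:\s+as\s+(\w+))?\s*$`)
)

// parseImport handles a single-line import statement.
func parseImport(line string) (importInfo, bool) {
	if m := fromRe.FindStringSubmatch(line); m != nil {
		info := importInfo{Module: m[1], IsFrom: true}
		for _, n := range strings.Split(strings.Trim(m[2], "() "), ",") {
			if n = strings.TrimSpace(n); n != "" {
				info.Names = append(info.Names, n)
			}
		}
		return info, true
	}
	if m := importRe.FindStringSubmatch(line); m != nil {
		return importInfo{Module: m[1], Alias: m[2]}, true
	}
	return importInfo{}, false
}

func main() {
	for _, line := range []string{"from typing import List, Optional", "import numpy as np"} {
		info, ok := parseImport(line)
		fmt.Printf("%v %+v\n", ok, info)
	}
}
```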

type MethodCall

type MethodCall struct {
	Name      string `json:"name"`                 // Method/function name
	Receiver  string `json:"receiver,omitempty"`   // Object the method is called on (e.g., "self", "cls", variable name)
	ClassName string `json:"class_name,omitempty"` // Class name if known
	Line      int    `json:"line,omitempty"`       // Line number of the call
}

MethodCall represents a call to another method/function

type MethodInfo

type MethodInfo struct {
	Name          string                 `json:"name"`
	Signature     string                 `json:"signature"`
	Description   string                 `json:"description"` // Method docstring
	Parameters    []codetypes.ParamInfo  `json:"parameters"`
	ReturnType    string                 `json:"return_type,omitempty"`
	Returns       []codetypes.ReturnInfo `json:"returns,omitempty"`
	Decorators    []string               `json:"decorators,omitempty"`
	Calls         []MethodCall           `json:"calls,omitempty"`     // Methods/functions this method calls
	TypeDeps      []string               `json:"type_deps,omitempty"` // Types used in parameters/return
	IsStatic      bool                   `json:"is_static"`
	IsClassMethod bool                   `json:"is_classmethod"`
	IsProperty    bool                   `json:"is_property"`
	IsAbstract    bool                   `json:"is_abstract"`
	IsAsync       bool                   `json:"is_async"`
	ClassName     string                 `json:"class_name,omitempty"`
	FilePath      string                 `json:"file_path,omitempty"`
	StartLine     int                    `json:"start_line,omitempty"`
	EndLine       int                    `json:"end_line,omitempty"`
	Code          string                 `json:"code,omitempty"`
}

MethodInfo describes a class method

type ModuleDependencies

type ModuleDependencies struct {
	ModuleName   string           `json:"module_name"`
	Imports      []ImportInfo     `json:"imports"`
	Dependencies []DependencyInfo `json:"dependencies"`
}

ModuleDependencies contains all dependency information for a module

type ModuleInfo

type ModuleInfo struct {
	Name        string         `json:"name"`        // Module name (e.g., "mypackage.mymodule")
	Path        string         `json:"path"`        // File path
	Description string         `json:"description"` // Module docstring
	Classes     []ClassInfo    `json:"classes"`
	Functions   []FunctionInfo `json:"functions"`
	Constants   []ConstantInfo `json:"constants"`
	Variables   []VariableInfo `json:"variables"`
	Imports     []ImportInfo   `json:"imports"`
}

ModuleInfo contains comprehensive information about a Python module/package

type PropertyInfo

type PropertyInfo struct {
	Name        string `json:"name"`
	Type        string `json:"type,omitempty"` // Type hint if available
	Description string `json:"description"`
	HasGetter   bool   `json:"has_getter"`
	HasSetter   bool   `json:"has_setter"`
	HasDeleter  bool   `json:"has_deleter"`
	FilePath    string `json:"file_path,omitempty"`
	StartLine   int    `json:"start_line,omitempty"`
	EndLine     int    `json:"end_line,omitempty"`
}

PropertyInfo describes a class property (using @property decorator)

type VariableInfo

type VariableInfo struct {
	Name        string `json:"name"`
	Type        string `json:"type,omitempty"` // Type annotation if available
	Value       string `json:"value,omitempty"`
	Description string `json:"description"`
	IsConstant  bool   `json:"is_constant"` // UPPER_CASE naming convention
	FilePath    string `json:"file_path,omitempty"`
	StartLine   int    `json:"start_line,omitempty"`
	EndLine     int    `json:"end_line,omitempty"`
}

VariableInfo describes a module-level or class variable
