# Testing Guide - Theory and Best Practices

## Table of Contents

- [Introduction](#introduction)
- [Your Current Approach vs. Test-Driven Development](#your-current-approach-vs-test-driven-development)
- [The Testing Pyramid](#the-testing-pyramid)
- [Key Concepts](#key-concepts)
- [Real-World Testing Workflow](#real-world-testing-workflow)
- [Regression Testing](#regression-testing---the-killer-feature)
- [Code Coverage](#coverage---how-much-is-tested)
- [Best Practices](#best-practices-summary)
- [Practical Examples](#practical-example-your-workflow)
- [When Should You Write Tests](#when-should-you-write-tests)
- [Getting Started](#your-next-steps)

---

## Introduction

This guide explains the theory and best practices of software testing, specifically for the PIF Compiler project. It moves beyond ad-hoc testing scripts to a comprehensive, automated testing approach.

---

## Your Current Approach vs. Test-Driven Development

### What You Do Now (Ad-hoc Scripts):

```python
# test_script.py
from cosing_service import cosing_search

result = cosing_search("WATER", mode="name")
print(result)  # Look at output, check if it looks right
```

**Problems:**

- ❌ Manual checking (is the output correct?)
- ❌ Not repeatable (you forget what "correct" looks like)
- ❌ Doesn't catch regressions (future changes break old code)
- ❌ No documentation (what should the function do?)
- ❌ Tedious for many functions

---

## The Testing Pyramid

```
        /\
       /  \        E2E Tests (Few)
      /----\
     /      \      Integration Tests (Some)
    /--------\
   /          \    Unit Tests (Many)
  /____________\
```

### 1. **Unit Tests** (Bottom - Most Important)

Test individual functions in isolation.
**Example:**

```python
def test_parse_cas_numbers_single():
    """Test parsing a single CAS number."""
    result = parse_cas_numbers(["7732-18-5"])
    assert result == ["7732-18-5"]  # ← Automated check
```

**Benefits:**

- ✅ Fast (milliseconds)
- ✅ No external dependencies (no API, no database)
- ✅ Pinpoint the exact problem
- ✅ Run hundreds in seconds

**When to use:**

- Testing individual functions
- Testing data parsing/validation
- Testing business logic calculations

---

### 2. **Integration Tests** (Middle)

Test multiple components working together.

**Example:**

```python
def test_full_cosing_workflow():
    """Test search + clean workflow."""
    raw = cosing_search("WATER", mode="name")
    clean = clean_cosing(raw)
    assert "cosingUrl" in clean
```

**Benefits:**

- ✅ Tests real interactions
- ✅ Catches integration bugs

**Drawbacks:**

- ⚠️ Slower (hits real APIs)
- ⚠️ Requires internet/database

**When to use:**

- Testing workflows across multiple services
- Testing API integrations
- Testing database interactions

---

### 3. **E2E Tests** (End-to-End - Top - Fewest)

Test the entire application flow (UI → Backend → Database).

**Example:**

```python
def test_create_pif_from_ui():
    """User creates PIF through Streamlit UI."""
    # Click buttons, fill forms, verify PDF generated
    ...
```

**When to use:**

- Testing complete user workflows
- Smoke tests before deployment
- Critical business processes

---

## Key Concepts

### 1. **Assertions - Automated Verification**

**Old way (manual):**

```python
result = parse_cas_numbers(["7732-18-5/56-81-5"])
print(result)
# You look at: ['7732-18-5', '56-81-5']
# Is this right? Maybe? You forget in 2 weeks.
```

**Test way (automated):**

```python
def test_parse_multiple_cas():
    result = parse_cas_numbers(["7732-18-5/56-81-5"])
    assert result == ["7732-18-5", "56-81-5"]  # ← Computer checks!
    # If wrong, test FAILS immediately
```

**Common Assertions:**

```python
# Equality
assert result == expected

# Truthiness
assert result is not None
assert "key" in result

# Exceptions
with pytest.raises(ValueError):
    invalid_function()

# Approximate equality (for floats)
assert result == pytest.approx(3.14159, rel=1e-5)
```

---

### 2. **Mocking - Control External Dependencies**

**Problem:** Testing `cosing_search()` hits the real COSING API:

- ⚠️ Slow (network request)
- ⚠️ Unreliable (API might be down)
- ⚠️ Expensive (rate limits)
- ⚠️ Hard to test errors (how do you make the API return an error?)

**Solution: Mock it!**

```python
from unittest.mock import Mock, patch

@patch('cosing_service.req.post')  # Replace the real HTTP request
def test_search_by_name(mock_post):
    # Control what the "API" returns
    mock_response = Mock()
    mock_response.json.return_value = {
        "results": [{"metadata": {"inciName": ["WATER"]}}]
    }
    mock_post.return_value = mock_response

    result = cosing_search("WATER", mode="name")

    assert result["inciName"] == ["WATER"]  # ← Test your logic, not the API
    mock_post.assert_called_once()  # Verify it was called
```

**Benefits:**

- ✅ Fast (no real network)
- ✅ Reliable (always works)
- ✅ Can test error cases (mock API failures)
- ✅ Isolate your code from external issues

**What to mock:**

- HTTP requests (`requests.get`, `requests.post`)
- Database calls (`db.find_one`, `db.insert`)
- File I/O (`open`, `read`, `write`)
- External APIs (COSING, ECHA, PubChem)
- Time-dependent functions (`datetime.now()`)

---

### 3. **Fixtures - Reusable Test Data**

**Without fixtures (repetitive):**

```python
def test_clean_basic():
    data = {"inciName": ["WATER"], "casNo": ["7732-18-5"], ...}
    result = clean_cosing(data)
    assert ...

def test_clean_empty():
    data = {"inciName": ["WATER"], "casNo": ["7732-18-5"], ...}  # Copy-paste!
    result = clean_cosing(data)
    assert ...
```

**With fixtures (DRY - Don't Repeat Yourself):**

```python
# conftest.py
@pytest.fixture
def sample_cosing_response():
    """Reusable COSING response data."""
    return {
        "inciName": ["WATER"],
        "casNo": ["7732-18-5"],
        "substanceId": ["12345"]
    }

# test file
def test_clean_basic(sample_cosing_response):  # Auto-injected!
    result = clean_cosing(sample_cosing_response)
    assert result["inciName"] == "WATER"

def test_clean_empty(sample_cosing_response):  # Reuse same data!
    result = clean_cosing(sample_cosing_response)
    assert "cosingUrl" in result
```

**Benefits:**

- ✅ No code duplication
- ✅ Centralized test data
- ✅ Easy to update (change once, affects all tests)
- ✅ Auto-cleanup (fixtures can tear down resources)

**Common fixture patterns:**

```python
# Database fixture with cleanup
@pytest.fixture
def test_db():
    db = connect_to_test_db()
    yield db  # Test runs here
    db.drop_all()  # Cleanup after test

# Temporary file fixture
@pytest.fixture
def temp_file(tmp_path):
    file_path = tmp_path / "test.json"
    file_path.write_text('{"test": "data"}')
    return file_path  # Auto-cleaned by pytest
```

---

## Real-World Testing Workflow

### Scenario: You Add a New Feature

**Step 1: Write the test FIRST (TDD - Test-Driven Development):**

```python
def test_parse_cas_removes_parentheses():
    """CAS numbers with parentheses should be cleaned."""
    result = parse_cas_numbers(["7732-18-5 (hydrate)"])
    assert result == ["7732-18-5"]
```

**Step 2: Run the test - it FAILS (expected!):**

```bash
$ uv run pytest tests/test_cosing_service.py::test_parse_cas_removes_parentheses
FAILED: AssertionError: assert ['7732-18-5 (hydrate)'] == ['7732-18-5']
```

**Step 3: Write code to make it pass:**

```python
def parse_cas_numbers(cas_string: list) -> list:
    cas_string = cas_string[0]
    cas_string = re.sub(r"\([^)]*\)", "", cas_string)  # ← Add this
    # ... rest of function
```

**Step 4: Run the test again - it PASSES:**

```bash
$ uv run pytest tests/test_cosing_service.py::test_parse_cas_removes_parentheses
PASSED ✓
```

**Step 5: Refactor if needed - the tests ensure you don't break anything!**

---

### TDD Cycle (Red-Green-Refactor)

```
1. RED: Write failing test
        ↓
2. GREEN: Write minimal code to pass
        ↓
3. REFACTOR: Improve code without breaking tests
        ↓
   Repeat
```

**Benefits:**

- ✅ Forces you to think about requirements first
- ✅ Prevents over-engineering
- ✅ Built-in documentation (tests show intended behavior)
- ✅ Confidence to refactor

---

## Regression Testing - The Killer Feature

**Scenario: You change code 6 months later:**

```python
# Original (working)
def parse_cas_numbers(cas_string: list) -> list:
    cas_string = cas_string[0]
    cas_string = re.sub(r"\([^)]*\)", "", cas_string)
    cas_parts = re.split(r"[/;,]", cas_string)  # Handles /, ;, ,
    return [cas.strip() for cas in cas_parts]

# You "improve" it
def parse_cas_numbers(cas_string: list) -> list:
    return cas_string[0].split("/")  # Simpler! But...
```

**Run tests:**

```bash
$ uv run pytest
FAILED: test_multiple_cas_with_semicolon
  Expected: ['7732-18-5', '56-81-5']
  Got: ['7732-18-5;56-81-5']  # ← Oops, broke semicolon support!

FAILED: test_cas_with_parentheses
  Expected: ['7732-18-5']
  Got: ['7732-18-5 (hydrate)']  # ← Broke parentheses removal!
```

**Without tests:**

- You deploy
- Users report bugs
- You're confused about what broke
- You spend hours debugging

**With tests:**

- Instant feedback
- Fix before deploying
- Save hours of debugging

---

## Coverage - How Much Is Tested?

### Running Coverage

```bash
uv run pytest --cov=src/pif_compiler --cov-report=html
```

### Sample Output

```
Name                 Stmts   Miss  Cover
--------------------------------------------------
cosing_service.py       89      5    94%
echa_service.py        156     89    43%
models.py               45     45     0%
--------------------------------------------------
TOTAL                  290    139    52%
```

### Interpretation

- ✅ `cosing_service.py` - **94% covered** (great!)
- ⚠️ `echa_service.py` - **43% covered** (needs more tests)
- ❌ `models.py` - **0% covered** (no tests yet)

### Coverage Goals

| Coverage | Status | Action |
|----------|--------|--------|
| 90-100% | ✅ Excellent | Maintain |
| 70-90% | ⚠️ Good | Add edge cases |
| 50-70% | ⚠️ Acceptable | Prioritize critical paths |
| <50% | ❌ Poor | Add tests immediately |

**Target:** 80%+ for business-critical code

### HTML Coverage Report

```bash
uv run pytest --cov=src/pif_compiler --cov-report=html
# Open htmlcov/index.html in browser
```

Shows:

- Which lines are tested (green)
- Which lines are not tested (red)
- Which branches are not covered

---

## Best Practices Summary

### ✅ DO:

1. **Write tests for all business logic**

   ```python
   # YES: Test calculations
   def test_sed_calculation():
       ingredient = Ingredient(quantity=10.0, dap=0.5)
       assert ingredient.calculate_sed() == 5.0
   ```

2. **Mock external dependencies**

   ```python
   # YES: Mock API calls
   @patch('cosing_service.req.post')
   def test_search(mock_post):
       mock_post.return_value.json.return_value = {...}
   ```

3. **Test edge cases**

   ```python
   # YES: Test edge cases
   def test_parse_empty_cas():
       assert parse_cas_numbers([""]) == []

   def test_parse_invalid_cas():
       with pytest.raises(ValueError):
           parse_cas_numbers(["abc-def-ghi"])
   ```

4. **Keep tests simple**

   ```python
   # YES: One test = one thing
   def test_cas_removes_whitespace():
       assert parse_cas_numbers([" 123-45-6 "]) == ["123-45-6"]

   # NO: Testing multiple things
   def test_cas_everything():
       assert parse_cas_numbers([" 123-45-6 "]) == ["123-45-6"]
       assert parse_cas_numbers(["123-45-6/789-01-2"]) == [...]
       # Too much in one test!
   ```

5. **Run tests before committing**

   ```bash
   git add .
   uv run pytest  # ← Always run first!
   git commit -m "Add feature X"
   ```

6. **Use descriptive test names**

   ```python
   # YES: Describes what it tests
   def test_parse_cas_removes_parenthetical_info(): ...

   # NO: Vague
   def test_cas_1(): ...
   ```

---

### ❌ DON'T:

1. **Don't test external libraries**

   ```python
   # NO: Testing if requests.post works
   def test_requests_library():
       response = requests.post("https://example.com")
       assert response.status_code == 200

   # YES: Test YOUR code that uses requests
   @patch('requests.post')
   def test_my_search_function(mock_post): ...
   ```

2. **Don't make tests dependent on each other**

   ```python
   # NO: test_b depends on test_a
   def test_a_creates_data():
       db.insert({"id": 1, "name": "test"})

   def test_b_uses_data():
       data = db.find_one({"id": 1})  # Breaks if test_a fails!

   # YES: Each test is independent
   def test_b_uses_data():
       db.insert({"id": 1, "name": "test"})  # Create own data
       data = db.find_one({"id": 1})
   ```

3. **Don't test implementation details**

   ```python
   # NO: Testing internal variable names
   def test_internal_state():
       obj = MyClass()
       assert obj._internal_var == "value"  # Breaks with refactoring

   # YES: Test public behavior
   def test_public_api():
       obj = MyClass()
       assert obj.get_value() == "value"
   ```

4. **Don't skip tests**

   ```python
   # NO: Commenting out failing tests
   # def test_broken_feature():
   #     assert broken_function() == "expected"

   # YES: Fix the test or mark it as TODO
   @pytest.mark.skip(reason="Feature not implemented yet")
   def test_future_feature(): ...
   ```

---

## Practical Example: Your Workflow

### Before (Manual Script)

```python
# test_water.py
from cosing_service import cosing_search, clean_cosing

result = cosing_search("WATER", "name")
print(result)  # ← You manually check

clean = clean_cosing(result)
print(clean)  # ← You manually check again

# Run 10 times with different inputs... tedious!
```

**Problems:**

- Manual verification
- Slow (type command, read output, verify)
- Error-prone (miss things)
- Not repeatable

---

### After (Automated Tests)

```python
# tests/test_cosing_service.py
def test_search_and_clean_water():
    """Water should be searchable and cleanable."""
    result = cosing_search("WATER", "name")
    assert result is not None
    assert "inciName" in result

    clean = clean_cosing(result)
    assert clean["inciName"] == "WATER"
    assert "cosingUrl" in clean

# Run ONCE: pytest
# It checks everything automatically!
```

**Run all 25 tests:**

```bash
$ uv run pytest
tests/test_cosing_service.py::TestParseCasNumbers::test_single_cas_number PASSED
tests/test_cosing_service.py::TestParseCasNumbers::test_multiple_cas_with_slash PASSED
...
======================== 25 passed in 0.5s ========================
```

**Benefits:**

- ✅ All pass? Safe to deploy!
- ❌ One fails? Fix before deploying!
- ⏱️ 25 tests in 0.5 seconds vs. 30 minutes of manual testing

---

## When Should You Write Tests?

### Always Test:

✅ **Business logic** (calculations, data processing)

```python
# YES
def test_calculate_sed():
    assert calculate_sed(quantity=10, dap=0.5) == 5.0
```

✅ **Data validation** (Pydantic models)

```python
# YES
def test_ingredient_validates_cas_format():
    with pytest.raises(ValidationError):
        Ingredient(cas="invalid", quantity=10.0)
```

✅ **API integrations** (with mocks)

```python
# YES
@patch('requests.post')
def test_cosing_search(mock_post):
    ...
```

✅ **Bug fixes** (write the test first, then fix)

```python
# YES
def test_bug_123_empty_cas_crash():
    """Regression test for bug #123."""
    result = parse_cas_numbers([])  # Used to crash
    assert result == []
```

---

### Sometimes Test:

⚠️ **UI code** (harder to test, less critical)

```python
# Streamlit UI tests are complex, lower priority
```

⚠️ **Configuration** (usually simple)

```python
# Config loading is straightforward; test it only if it has complex logic
```

---

### Don't Test:

❌ **Third-party libraries** (they have their own tests)

```python
# NO: Testing if pandas works
def test_pandas_dataframe():
    df = pd.DataFrame({"a": [1, 2, 3]})
    assert len(df) == 3  # The pandas team already tested this!
```

❌ **Trivial code**

```python
# NO: Testing simple getters/setters
class MyClass:
    def get_name(self):
        return self.name  # Too simple to test
```

---

## Your Next Steps

### 1. Install Pytest

```bash
cd c:\Users\adish\Projects\pif_compiler
uv add --dev pytest pytest-cov pytest-mock
```

### 2. Run the COSING Tests

```bash
# Run all tests
uv run pytest

# Run with verbose output
uv run pytest -v

# Run a specific test file
uv run pytest tests/test_cosing_service.py

# Run a specific test
uv run pytest tests/test_cosing_service.py::TestParseCasNumbers::test_single_cas_number
```

### 3. See Coverage

```bash
# Terminal report
uv run pytest --cov=src/pif_compiler/services/cosing_service

# HTML report (more detailed)
uv run pytest --cov=src/pif_compiler --cov-report=html
# Open htmlcov/index.html in browser
```

### 4. Start Writing Tests for New Code

Follow the TDD cycle:

1. **Red**: Write a failing test
2. **Green**: Write minimal code to pass
3. **Refactor**: Improve the code
4. Repeat!
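To kick off step 4, the regression cases quoted earlier in this guide can be collected into one table-driven check that runs as a plain script. A minimal sketch — the `parse_cas_numbers` body is copied from the "Original (working)" version in the regression section, and the expected values are the ones the guide's own examples use:

```python
import re

def parse_cas_numbers(cas_string: list) -> list:
    """Copy of the "Original (working)" version from the regression section."""
    s = cas_string[0]
    s = re.sub(r"\([^)]*\)", "", s)   # Drop parenthetical info like "(hydrate)"
    parts = re.split(r"[/;,]", s)     # Split on /, ;, and ,
    return [cas.strip() for cas in parts]

# One (input, expected) row per behavior the guide's tests rely on
CASES = [
    (["7732-18-5"], ["7732-18-5"]),                      # single CAS
    (["7732-18-5/56-81-5"], ["7732-18-5", "56-81-5"]),   # slash-separated
    (["7732-18-5;56-81-5"], ["7732-18-5", "56-81-5"]),   # semicolon-separated
    (["7732-18-5 (hydrate)"], ["7732-18-5"]),            # parentheses removed
]

for given, expected in CASES:
    assert parse_cas_numbers(given) == expected, (given, expected)
```

Once pytest is installed, the same table drops straight into `@pytest.mark.parametrize`, which reports each row as its own test.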
---

## Additional Resources

### Pytest Documentation

- [Official Pytest Docs](https://docs.pytest.org/)
- [Pytest Fixtures](https://docs.pytest.org/en/stable/fixture.html)
- [Pytest Mocking](https://docs.pytest.org/en/stable/monkeypatch.html)

### Testing Philosophy

- [Test-Driven Development (TDD)](https://www.freecodecamp.org/news/test-driven-development-what-it-is-and-what-it-is-not-41fa6bca02a2/)
- [Testing Best Practices](https://testautomationuniversity.com/)
- [The Testing Pyramid](https://martinfowler.com/articles/practical-test-pyramid.html)

### PIF Compiler Specific

- [tests/README.md](../tests/README.md) - Test suite documentation
- [tests/RUN_TESTS.md](../tests/RUN_TESTS.md) - Quick start guide
- [REFACTORING.md](../REFACTORING.md) - Code organization changes

---

## Summary

**Testing transforms your development workflow:**

| Without Tests | With Tests |
|---------------|------------|
| Manual verification | Automated checks |
| Slow feedback | Instant feedback |
| Fear of breaking things | Confidence to refactor |
| Undocumented behavior | Tests as documentation |
| Debug for hours | Pinpoint issues immediately |

**Start small:**

1. Write tests for one service (✅ COSING done!)
2. Add tests for new features
3. Fix bugs with tests first
4. Gradually increase coverage

**The investment pays off:**

- Fewer bugs in production
- Faster development (less debugging)
- Better code design
- Easier collaboration
- Peace of mind 😌

---

*Last updated: 2025-01-04*