fix: handle HTML-only emails with useless text/plain fallback #247
Problem
The Gmail API returns multipart/alternative emails with both text/plain and text/html parts. Some email senders (e.g., Rentalcars.com, Booking.com) put only a useless fallback message in text/plain, while all actual content is in the HTML part (often 1000x+ longer).
The current implementation always prioritizes text/plain over text/html, causing these emails to return only the fallback message.
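To make the failure mode concrete, here is a minimal sketch of a plain-first selection over a Gmail-style payload. The helper names (`b64url`, `decode_part`, `naive_body`), the sample `payload`, and the placeholder wording are illustrative assumptions, not code or data from this repository. Only the `parts` / `mimeType` / `body.data` layout with base64url-encoded data follows the Gmail API message format.

```python
import base64


def b64url(text: str) -> str:
    """Encode text the way Gmail encodes body data (base64url)."""
    return base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii")


def decode_part(part: dict) -> str:
    """Decode the base64url body of one MIME part; empty string if missing."""
    data = part.get("body", {}).get("data", "")
    # Restore any padding that may have been stripped before decoding.
    padded = data + "=" * (-len(data) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8", errors="replace")


def naive_body(payload: dict) -> str:
    """Hypothetical plain-first selection: text/plain wins whenever it is non-empty."""
    bodies = {p["mimeType"]: decode_part(p) for p in payload.get("parts", [])}
    return bodies.get("text/plain") or bodies.get("text/html", "")


# Made-up message: placeholder in text/plain, real content in text/html.
payload = {
    "mimeType": "multipart/alternative",
    "parts": [
        {"mimeType": "text/plain",
         "body": {"data": b64url("Please view this email in HTML.")}},
        {"mimeType": "text/html",
         "body": {"data": b64url("<p>Booking confirmed: pick-up at 10:00.</p>")}},
    ],
}

print(naive_body(payload))
```

With this selection the caller only ever sees the placeholder sentence, even though the HTML part carries the booking details.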
Root Cause
_format_body_content() returned text/plain immediately if it was non-empty, without checking if it was useful content or a fallback placeholder.
Solution
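A rough sketch of what fallback detection followed by HTML-to-text conversion could look like. This is not the PR's code: the function names (`html_to_text`, `looks_like_fallback`, `choose_body`) and the `ratio` threshold of 50 are assumptions chosen for illustration. The PR adds beautifulsoup4 for the conversion step; this sketch swaps in the standard library's `html.parser` so it runs without extra dependencies.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def html_to_text(html: str) -> str:
    """Convert HTML to plain text (stdlib stand-in for a BeautifulSoup call)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def looks_like_fallback(plain: str, html: str, ratio: int = 50) -> bool:
    """Assumed heuristic: empty text, an HTML comment marker in text/plain,
    or an HTML part far longer than the plain part."""
    stripped = plain.strip()
    if not stripped or "<!--" in stripped:
        return True
    return len(html) > ratio * len(stripped)


def choose_body(plain: str, html: str) -> str:
    """Prefer text/plain unless it looks like a placeholder and HTML exists."""
    if html.strip() and looks_like_fallback(plain, html):
        return html_to_text(html)
    return plain
```

The length ratio is the fragile part of any such heuristic: a short but genuine plain-text part next to heavily styled HTML could be misclassified, so the threshold would need tuning against real messages.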
<!--) in text/plain
Changes
pyproject.toml: Added beautifulsoup4>=4.12.0 dependency
gmail/gmail_tools.py:
_html_to_text() function to convert HTML to readable text
_format_body_content() with fallback detection
Testing
Tested with Rentalcars.com confirmation emails:
Impact
Note on uv.lock
The uv.lock file was intentionally not included in this PR to avoid format conflicts. The dependency changes in pyproject.toml are sufficient, and the maintainers can regenerate the lock file with their version of uv.
🤖 Generated with Claude Code