Description
Feature Request
Is your feature request related to a problem? Please describe.
The MCP server's Google Drive tools don't work well with binary files such as PDFs, images, and videos. When get_drive_file_content or get_doc_content encounters one of these file types, it returns a generic placeholder message like "[Binary or unsupported text encoding for mimeType 'application/pdf' - X bytes]" instead of the actual file content.
The binary content is successfully downloaded via MediaIoBaseDownload but then discarded when UTF-8 decoding fails. The same limitation exists in get_doc_content for non-Google Docs files.
Describe the solution you'd like
I would like the binary content to be returned to clients (encoded as base64) so that AI agents can pass it to other specialized tools for processing. Specifically, modify get_drive_file_content and get_doc_content to return base64-encoded binary content when UTF-8 decoding fails:
    except UnicodeDecodeError:
        import base64
        body_text = (
            f"[Binary content - mimeType '{mime_type}' - {len(file_content_bytes)} bytes]\n"
            f"Base64: {base64.b64encode(file_content_bytes).decode('ascii')}"
        )

This would enable clients to decode and process the binary data themselves using appropriate tools.
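To make the round trip concrete, here is a minimal standalone sketch of the proposed fallback together with the client-side decoding step. The function names render_file_body and extract_binary are hypothetical and do not exist in the codebase; the sketch also assumes the payload stays on a single line prefixed with "Base64: ", as in the snippet above.

```python
import base64


def render_file_body(file_content_bytes: bytes, mime_type: str) -> str:
    """Return the text if it is valid UTF-8, else a header plus a Base64 payload.

    Hypothetical helper that mirrors the fallback proposed in this issue.
    """
    try:
        return file_content_bytes.decode("utf-8")
    except UnicodeDecodeError:
        encoded = base64.b64encode(file_content_bytes).decode("ascii")
        return (
            f"[Binary content - mimeType '{mime_type}' - {len(file_content_bytes)} bytes]\n"
            f"Base64: {encoded}"
        )


def extract_binary(body_text: str) -> bytes:
    """Client side: recover the original bytes from the 'Base64: ' line."""
    marker = "Base64: "
    for line in body_text.splitlines():
        if line.startswith(marker):
            return base64.b64decode(line[len(marker):])
    raise ValueError("no Base64 payload found in tool output")


# Bytes that are not valid UTF-8 stand in for a real PDF download.
sample = b"%PDF-1.4\n\xff\xfe\x00binary"
body = render_file_body(sample, "application/pdf")
restored = extract_binary(body)
```

A client would then hand the restored bytes to whatever PDF, image, or video tool it prefers. Note that Base64 inflates the payload by roughly a third, which may matter for large files.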
Describe alternatives you've considered
- Native PDF Processing: Implement a dedicated tool for PDF extraction using libraries like PyPDF2 or pdfplumber, following the same pattern as extract_office_xml_text, which handles DOCX, XLSX, and PPTX files. However, this would require adding dependencies and maintaining PDF-specific logic.
- Optional Parameter: Add a return_binary_as_base64: bool = False parameter to both functions, allowing clients to opt-in to receiving binary content. This maintains backward compatibility but adds API complexity.
- Separate Tool: Create a new MCP tool specifically for retrieving binary content, keeping the existing tools text-focused. This is cleaner architecturally but requires clients to know which tool to use for which file type.
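For comparison, the optional-parameter alternative could look roughly like the sketch below. The function name format_binary_body is hypothetical, and the default branch simply reproduces the placeholder message quoted in the problem description, so existing callers would see no change unless they pass the flag.

```python
import base64


def format_binary_body(
    file_content_bytes: bytes,
    mime_type: str,
    return_binary_as_base64: bool = False,
) -> str:
    """Hypothetical helper: Base64 output only when the caller opts in."""
    try:
        return file_content_bytes.decode("utf-8")
    except UnicodeDecodeError:
        if not return_binary_as_base64:
            # Default: keep the placeholder quoted in the problem description.
            return (
                f"[Binary or unsupported text encoding for mimeType "
                f"'{mime_type}' - {len(file_content_bytes)} bytes]"
            )
        # Opt-in: append the payload so the client can decode it.
        encoded = base64.b64encode(file_content_bytes).decode("ascii")
        return (
            f"[Binary content - mimeType '{mime_type}' - {len(file_content_bytes)} bytes]\n"
            f"Base64: {encoded}"
        )


# The PNG signature is not valid UTF-8, so it triggers the fallback path.
png_header = b"\x89PNG\r\n\x1a\n"
default_body = format_binary_body(png_header, "image/png")
opt_in_body = format_binary_body(png_header, "image/png", return_binary_as_base64=True)
```

The trade-off is the one noted above: the default stays backward compatible, at the cost of one more parameter on both tools.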
Additional context
The codebase already has infrastructure for handling multiple file types through extract_office_xml_text, so this enhancement aligns with the existing pattern of supporting diverse formats. Both affected functions are registered as MCP tools and would need their return type documentation updated. The base64 encoding approach is the simplest to implement and most flexible, allowing clients to choose their own processing tools without adding dependencies to this codebase.