[Feat request] Google Drive tool : Enable Binary Content Passthrough for Unsupported File Types (PDFs, Images, etc.)

## Feature Request

**Is your feature request related to a problem? Please describe.**

The MCP server's Google Drive tool does not handle binary files such as PDFs, images, and videos well. When `get_drive_file_content` or `get_doc_content` encounters one of these file types, it returns a generic placeholder message like `"[Binary or unsupported text encoding for mimeType 'application/pdf' - X bytes]"` instead of the actual file content.

The binary content is successfully downloaded via `MediaIoBaseDownload` but then discarded when UTF-8 decoding fails. The same limitation exists in `get_doc_content` for non-Google Docs files.

**Describe the solution you'd like**

I would like the binary content to be returned to clients (encoded as base64) so that AI agents can pass it to other specialized tools for processing. Specifically, modify `get_drive_file_content` and `get_doc_content` to return base64-encoded binary content when UTF-8 decoding fails:

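```python
except UnicodeDecodeError:
    # Not valid UTF-8: return the raw bytes as base64 instead of discarding them.
    import base64  # could also live with the module-level imports
    body_text = (
        f"[Binary content - mimeType '{mime_type}' - {len(file_content_bytes)} bytes]\n"
        f"Base64: {base64.b64encode(file_content_bytes).decode('ascii')}"
    )
```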

This would enable clients to decode and process the binary data themselves using appropriate tools.
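To illustrate the round trip, here is a minimal self-contained sketch of the fallback plus the matching client-side decode. The helper names `render_file_body` and `extract_binary` are hypothetical and do not exist in the codebase; the sketch also assumes the `Base64: ` line format proposed above and works on plain `bytes` rather than a real Drive download.

```python
import base64


def render_file_body(file_content_bytes: bytes, mime_type: str) -> str:
    """Return decoded text, or a base64 payload when the bytes are not UTF-8.

    Hypothetical helper mirroring the proposed except-branch.
    """
    try:
        return file_content_bytes.decode("utf-8")
    except UnicodeDecodeError:
        encoded = base64.b64encode(file_content_bytes).decode("ascii")
        return (
            f"[Binary content - mimeType '{mime_type}' - "
            f"{len(file_content_bytes)} bytes]\n"
            f"Base64: {encoded}"
        )


def extract_binary(body_text: str) -> bytes:
    """Client side: recover the original bytes from the 'Base64: ' line."""
    marker = "Base64: "
    payload = body_text.split(marker, 1)[1]
    return base64.b64decode(payload)


# Bytes that start like a PDF header but are not valid UTF-8.
pdf_like = b"%PDF-1.7\n\xff\xfe\x00binary"
body = render_file_body(pdf_like, "application/pdf")
assert extract_binary(body) == pdf_like
```

A client could then hand the recovered bytes to whatever PDF or image tool it prefers.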

**Describe alternatives you've considered**

1. **Native PDF Processing**: Implement a dedicated tool for PDF extraction using libraries like `PyPDF2` or `pdfplumber`, following the same pattern as `extract_office_xml_text`, which handles DOCX, XLSX, and PPTX files. However, this would require adding dependencies and maintaining PDF-specific logic.
2. **Optional Parameter**: Add a `return_binary_as_base64: bool = False` parameter to both functions, allowing clients to opt-in to receiving binary content. This maintains backward compatibility but adds API complexity.
3. **Separate Tool**: Create a new MCP tool specifically for retrieving binary content, keeping the existing tools text-focused. This is cleaner architecturally but requires clients to know which tool to use for which file type.
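For comparison, the opt-in variant from alternative 2 might look roughly like the sketch below. The function `format_body` is hypothetical; only the `return_binary_as_base64` parameter name and the placeholder wording come from this issue. With the flag left at its default, the output matches the current placeholder, which is what keeps the change backward compatible.

```python
import base64


def format_body(
    file_content_bytes: bytes,
    mime_type: str,
    return_binary_as_base64: bool = False,
) -> str:
    """Hypothetical sketch of the opt-in flag; not the project's actual code."""
    try:
        return file_content_bytes.decode("utf-8")
    except UnicodeDecodeError:
        size = len(file_content_bytes)
        if not return_binary_as_base64:
            # Default path: same placeholder the tools return today.
            return (
                f"[Binary or unsupported text encoding for mimeType "
                f"'{mime_type}' - {size} bytes]"
            )
        # Opt-in path: include the bytes as base64 for downstream tools.
        encoded = base64.b64encode(file_content_bytes).decode("ascii")
        return (
            f"[Binary content - mimeType '{mime_type}' - {size} bytes]\n"
            f"Base64: {encoded}"
        )


png_magic = b"\x89PNG\r\n\x1a\n"
print(format_body(png_magic, "image/png"))
print(format_body(png_magic, "image/png", return_binary_as_base64=True))
```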

**Additional context**

The codebase already has infrastructure for handling multiple file types through `extract_office_xml_text`, so this enhancement aligns with the existing pattern of supporting diverse formats. Both affected functions are registered as MCP tools and would need their return type documentation updated. The base64 encoding approach is the simplest to implement and most flexible, allowing clients to choose their own processing tools without adding dependencies to this codebase.

[Feat request] Google Drive tool : Enable Binary Content Passthrough for Unsupported File Types (PDFs, Images, etc.) #263
