woob.browser.filters.file

class MimeType(selector=None, symbols='', replace=[], children=True, newlines=True, transliterate=False, normalize='NFC', **kwargs)[source]

Bases: CleanText

A filter that determines the MIME (Multipurpose Internet Mail Extensions) type of a file from a given string, which can be a file path or a file name with an extension.

Parameters:

default (Any, optional) – The default MIME type to be returned when the file type is not recognized.

filter(txt)[source]

Get the MIME type from a file name or path.

Parameters:

txt (str) – The file name or path for which to determine the MIME type.

Raises:

FormatError – If the MIME type is not recognized and no default is set.

Return type:

Any

>>> MimeType().filter('foo.pdf')
'application/pdf'
>>> MimeType().filter('path/foo/invoices.tar.gz')
'application/x-tar'
>>> MimeType(default='NAN').filter('foo.no')
'NAN'
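
The lookup described above can be approximated with the standard library alone. The following is a minimal sketch, not woob's implementation: the function name `guess_mime` and the sentinel `_NO_DEFAULT` are hypothetical stand-ins, and `ValueError` is used in place of woob's `FormatError` so the snippet runs without woob installed.

```python
import mimetypes

# Hypothetical sentinel, standing in for woob's NO_DEFAULT.
_NO_DEFAULT = object()


def guess_mime(name, default=_NO_DEFAULT):
    """Guess a MIME type from the extension of a file name or path.

    Illustrative only: mirrors the documented behaviour of MimeType.filter
    using the standard-library mimetypes module.
    """
    # guess_type returns (type, encoding); for 'x.tar.gz' the encoding is
    # 'gzip' and the type is that of the inner '.tar' extension.
    mime, _encoding = mimetypes.guess_type(name)
    if mime is not None:
        return mime
    if default is _NO_DEFAULT:
        # woob raises FormatError here; ValueError keeps the sketch standalone.
        raise ValueError(f"unrecognized MIME type for {name!r}")
    return default
```

Called without a default, the sketch raises on an unknown extension; called with `default='NAN'`, it returns `'NAN'` instead, matching the examples above.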

class FileExtension(selector=None, validate_mime=False, default=NO_DEFAULT)[source]

Bases: CleanText

A filter to extract the file extension from a given string representing a file name or a file path.

Parameters:
  • default (Any, optional) – The default extension to be returned when the file extension is not recognized. (default: NO_DEFAULT)

  • validate_mime (bool, optional) – Whether to check that the extracted extension maps to a known MIME type; an extension that does not is treated as unrecognized. (default: False)

filter(txt)[source]

Get the file extension from a file name or path.

Parameters:

txt (str) – The file name or path for which to extract the file extension.

Raises:

FormatError – If the file extension is not recognized and no default is set.

Return type:

Any

>>> FileExtension().filter('file.docx')
'docx'
>>> FileExtension().filter('path/to/file.tar.gz')
'tar.gz'
>>> FileExtension(default='NAN').filter('file_without_extension')
'NAN'
>>> FileExtension().filter('/home/user/Documents/report.pdf')
'pdf'
>>> FileExtension(default='UNKNOWN').filter('spreadsheet')
'UNKNOWN'
>>> FileExtension(default='UNKNOWN', validate_mime=True).filter('path/to/file.dfs')
'UNKNOWN'
>>> FileExtension(default='UNKNOWN', validate_mime=True).filter('file.jpg')
'jpg'
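
The extraction and the optional MIME check can be sketched with the standard library. This is an assumption about how such a filter might work, not woob's actual code: `file_extension` and `_NO_DEFAULT` are hypothetical names, `ValueError` stands in for `FormatError`, and the handling of compound extensions such as `tar.gz` (treating a trailing compression suffix as part of the extension) is one possible design choice.

```python
import mimetypes
import posixpath

# Hypothetical sentinel, standing in for woob's NO_DEFAULT.
_NO_DEFAULT = object()


def file_extension(name, validate_mime=False, default=_NO_DEFAULT):
    """Extract the extension (without the leading dot) from a name or path.

    Illustrative only: mirrors the documented behaviour of
    FileExtension.filter using standard-library helpers.
    """
    base, ext = posixpath.splitext(name)
    # A compression suffix such as '.gz' wraps an inner extension,
    # so 'file.tar.gz' yields '.tar.gz' rather than '.gz'.
    if ext in mimetypes.encodings_map:
        _base, inner = posixpath.splitext(base)
        ext = inner + ext

    recognized = bool(ext)
    if recognized and validate_mime:
        # Keep the extension only if the name maps to a known MIME type.
        recognized = mimetypes.guess_type(name)[0] is not None

    if recognized:
        return ext[1:]
    if default is _NO_DEFAULT:
        # woob raises FormatError here; ValueError keeps the sketch standalone.
        raise ValueError(f"unrecognized file extension for {name!r}")
    return default
```

With `validate_mime=True`, the result depends on the MIME database available to `mimetypes` on the local system, so an extension unknown on one machine may be accepted on another.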