Minoshiro¶
Inspired by the original Reddit bot, Minoshiro is an async Python library that brings various web APIs for anime, manga, and light novel into one place.
Requirements¶
Requires Python 3.6+
Features¶
- Simple and modern Pythonic API using async/await syntax
- Fetches search results from up to 9 different websites
- Cached search results for faster access
- Integrates with existing databases
Install¶
To install the base version, install from PyPI:
pip install minoshiro
This library also comes with a PostgreSQL data controller. To use it, you will need a hosted PostgreSQL instance, version 9.6 or above.
To install with the built-in PostgreSQL support, use:
pip install minoshiro[postgres]
or install asyncpg version 0.12.0 or later from PyPI.
To achieve maximum speed with this library, uvloop is highly recommended.
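As a minimal sketch of how uvloop is commonly enabled, the snippet below switches asyncio to uvloop's event loop policy when the package is installed and keeps the default loop otherwise. The helper name `use_uvloop_if_available` and the `main` coroutine are placeholders for illustration and are not part of Minoshiro.

```python
import asyncio


def use_uvloop_if_available() -> bool:
    """Switch asyncio to uvloop when it is installed; report whether it was."""
    try:
        import uvloop
    except ImportError:
        # uvloop is optional, so fall back to the default event loop.
        return False
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    return True


async def main() -> str:
    # Placeholder for the application's real coroutine.
    await asyncio.sleep(0)
    return "done"


enabled = use_uvloop_if_available()

# Loops created after this point follow the active policy.
loop = asyncio.new_event_loop()
try:
    result = loop.run_until_complete(main())
finally:
    loop.close()
```

The policy has to be set before the event loop is created, so the call belongs at the very start of the program.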
Quickstart¶
This page gives a brief introduction to the library. It assumes you have the library installed; if you don’t, check the Install section.
The code block below showcases basic usage with the built-in support for PostgreSQL and SQLite3.
from pathlib import Path

from minoshiro import Medium, Minoshiro, Site


async def postgres():
    postgres_config = {
        "host": "localhost",
        "port": "5432",
        "user": "postgres",
        "database": "postgres"
    }
    robo = await Minoshiro.from_postgres(
        postgres_config, cache_pages=1
    )
    # get_data eagerly evaluates all the data and returns them in a dict
    saekano = await robo.get_data(
        'Saenai Heroine no Sodatekata', Medium.ANIME
    )
    for site, data in saekano.items():
        print(site, data)
    # yield_data lazily evaluates the data and you can iterate over them
    async for site, data in robo.yield_data('New Game!', Medium.MANGA):
        print(site, data)


async def sqlite():
    # from_sqlite accepts either a string or a pathlib.Path object.
    sqlite_path = 'path/to/sqlite/database'
    another_path = Path('another/sqlite/path')
    robo = await Minoshiro.from_sqlite(
        sqlite_path
    )
    # We only want the results from those 2 sites.
    sites = (Site.LNDB, Site.ANILIST)
    # Use yield_data with "async for"; get_data returns a dict instead.
    async for site, data in robo.yield_data('overlord', Medium.LN, sites):
        print(site, data)
    # The database path is the only positional argument of from_sqlite.
    another_robo = await Minoshiro.from_sqlite(
        another_path
    )
    # Specify the seconds for HTTP request timeout.
    print(
        await another_robo.get_data(
            'Love Live Sunshine!', Medium.ANIME, timeout=10
        )
    )
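The difference between the eager get_data and the lazy yield_data can be sketched without the library installed. The two functions below are hypothetical stand-ins that only mimic the shape of those methods (a dict versus an async generator of site and data pairs). The `FAKE_RESULTS` data is invented for illustration, and none of this reflects how Minoshiro is implemented internally.

```python
import asyncio
from typing import AsyncIterator, Dict, Tuple

# Hypothetical stand-in data; a real search would query the web APIs.
FAKE_RESULTS = {
    "ANILIST": {"title": "New Game!"},
    "KITSU": {"title": "New Game!"},
}


async def yield_data(query: str) -> AsyncIterator[Tuple[str, dict]]:
    """Lazily yield one (site, data) pair at a time."""
    for site, data in FAKE_RESULTS.items():
        await asyncio.sleep(0)  # stands in for one HTTP request
        yield site, data


async def get_data(query: str) -> Dict[str, dict]:
    """Eagerly collect every pair into a dict before returning."""
    return {site: data async for site, data in yield_data(query)}


async def main():
    first_site = None
    async for site, _ in yield_data("New Game!"):
        first_site = site
        break  # lazy: the remaining sites are never requested
    everything = await get_data("New Game!")
    return first_site, everything


# loop.run_until_complete also works on Python 3.6, unlike asyncio.run.
loop = asyncio.new_event_loop()
try:
    first_site, everything = loop.run_until_complete(main())
finally:
    loop.close()
```

The quickstart coroutines can be driven the same way, for example with `loop.run_until_complete(postgres())`.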
Logging¶
Minoshiro logs errors and debug information via the logging Python module. The library comes with a basic logger that prints to STDERR. It is strongly recommended that you log to a log file in addition to printing to STDERR.
Configuration of the logger can be as simple as:
from logging import FileHandler, Formatter
from minoshiro import get_default_logger
logger = get_default_logger()
file_handler = FileHandler(filename='my_log_file.log')
file_handler.setFormatter(
Formatter('%(asctime)s:%(levelname)s:%(name)s: %(message)s')
)
logger.addHandler(file_handler)
More advanced setups are possible with the logging module. You can configure the logger to your liking, for example:
from logging import getLogger
import minoshiro
my_logger = getLogger('minoshiro')
...
And finally, if you already have a logger set up in your application, you can simply use the existing logger instead of the one provided by the library.
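As one possible advanced setup, the sketch below gives the logger and its handler different levels, so that debug messages are dropped while warnings are recorded. It uses only the standard logging module and the 'minoshiro' logger name from the snippet above. The in-memory stream and the two log messages are illustrative assumptions; a real application would more likely attach a FileHandler.

```python
import io
import logging

# 'minoshiro' is the logger name used in the getLogger snippet above.
logger = logging.getLogger('minoshiro')
logger.setLevel(logging.DEBUG)

# An in-memory stream keeps the sketch self-contained.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setLevel(logging.WARNING)
handler.setFormatter(
    logging.Formatter('%(levelname)s:%(name)s: %(message)s')
)
logger.addHandler(handler)

logger.debug('cache hit')            # below the handler level, dropped
logger.warning('request timed out')  # written to the buffer

captured = buffer.getvalue()
```

Setting the level on the handler rather than on the logger lets several handlers share one logger while each keeps its own verbosity.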
Database¶
Minoshiro uses caching to make search results faster and more accurate.
Built-in database support¶
Minoshiro comes with built-in support for SQLite3 and PostgreSQL databases.
To use the built-in SQLite3 support, use the from_sqlite method as follows:
from minoshiro import Minoshiro
async def main():
    db_path = 'path/to/database'
    robo = await Minoshiro.from_sqlite(db_path)
To use the built-in PostgreSQL support, you will need asyncpg to be installed. Check Install for more information.
Then, use the from_postgres method as follows:
from minoshiro import Minoshiro
async def main():
    db_config = {
        "host": "localhost",
        "port": "5432",
        "user": "postgres",
        "database": "postgres"
    }
    robo = await Minoshiro.from_postgres(
        db_config, schema='my_schema'
    )
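The connection settings do not have to be hard-coded. As a hedged sketch, the hypothetical helper below builds the same kind of dict from the conventional PostgreSQL environment variables (PGHOST, PGPORT, PGUSER, PGDATABASE, PGPASSWORD). The helper name and the fallback defaults are assumptions for illustration; the keys are keyword arguments accepted by asyncpg's connect function.

```python
import os
from typing import Mapping, Optional


def postgres_config_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build a db_config dict from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    config = {
        "host": env.get("PGHOST", "localhost"),
        "port": env.get("PGPORT", "5432"),
        "user": env.get("PGUSER", "postgres"),
        "database": env.get("PGDATABASE", "postgres"),
    }
    # Only include a password when one is actually configured.
    password = env.get("PGPASSWORD")
    if password:
        config["password"] = password
    return config


# Passing a plain dict instead of os.environ makes the helper easy to test.
example = postgres_config_from_env({"PGHOST": "db.internal", "PGPASSWORD": "s3cret"})
```

The resulting dict can then be passed as the first argument to `Minoshiro.from_postgres`.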
Extending DatabaseController¶
You may also write your own implementation of the database controller if you wish. To get started, inherit from the DataController class as follows:
from minoshiro import DataController
class MyDatabase(DataController):
    def __init__(self, logger):
        super().__init__(logger)
You will need to initialize the super class with a logger object.
Next, you will need to implement ALL of the following methods. The methods MUST be defined with async def.
@abstractmethod
async def get_identifier(self, query: str,
                         medium: Medium) -> Optional[Dict[Site, str]]:
    """
    Get the identifier of a given search query.
    :param query: the search query.
    :type query: str
    :param medium: the medium type.
    :type medium: Medium
    :return:
        A dict of all identifiers for this search query for all sites,
        None if nothing is found.
    :rtype: Optional[Dict[Site, str]]
    """
    raise NotImplementedError

@abstractmethod
async def set_identifier(self, name: str, medium: Medium,
                         site: Site, identifier: str):
    """
    Set the identifier for a given name.
    :param name: the name.
    :type name: str
    :param medium: the medium type.
    :type medium: Medium
    :param site: the site.
    :type site: Site
    :param identifier: the identifier.
    :type identifier: str
    """
    raise NotImplementedError

@abstractmethod
async def medium_data_by_id(self, id_: str, medium: Medium,
                            site: Site) -> Optional[dict]:
    """
    Get data by id.
    :param id_: the id.
    :type id_: str
    :param medium: the medium type.
    :type medium: Medium
    :param site: the site.
    :type site: Site
    :return: the data for that id if found.
    :rtype: Optional[dict]
    """
    raise NotImplementedError

@abstractmethod
async def set_medium_data(self, id_: str, medium: Medium,
                          site: Site, data: dict):
    """
    Set the data for a given id.
    :param id_: the id.
    :type id_: str
    :param medium: the medium type.
    :type medium: Medium
    :param site: the site.
    :type site: Site
    :param data: the data for the id.
    :type data: dict
    """
    raise NotImplementedError
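To make the four methods concrete, here is a minimal in-memory sketch that stores everything in plain dicts. So that it runs without the library installed, it defines small stand-ins for DataController, Medium and Site; in real code those would be imported from minoshiro. The class name `MemoryController`, the exact-match lookup and the identifier '12345' are assumptions for illustration only.

```python
import asyncio
import logging
from enum import Enum
from typing import Dict, Optional


# Stand-ins so the sketch runs on its own. In real code:
# from minoshiro import DataController, Medium, Site
class Medium(Enum):
    ANIME = 1
    MANGA = 2


class Site(Enum):
    MAL = 1
    ANILIST = 2


class DataController:
    def __init__(self, logger):
        self.logger = logger


class MemoryController(DataController):
    """Keeps the cache in plain dicts; everything is lost on exit."""

    def __init__(self, logger):
        super().__init__(logger)
        self._identifiers = {}  # (name, medium) -> {site: identifier}
        self._data = {}         # (id_, medium, site) -> data

    async def get_identifier(self, query: str,
                             medium: Medium) -> Optional[Dict[Site, str]]:
        found = self._identifiers.get((query, medium))
        return dict(found) if found else None

    async def set_identifier(self, name: str, medium: Medium,
                             site: Site, identifier: str):
        self._identifiers.setdefault((name, medium), {})[site] = identifier

    async def medium_data_by_id(self, id_: str, medium: Medium,
                                site: Site) -> Optional[dict]:
        return self._data.get((id_, medium, site))

    async def set_medium_data(self, id_: str, medium: Medium,
                              site: Site, data: dict):
        self._data[(id_, medium, site)] = data


async def demo():
    db = MemoryController(logging.getLogger('minoshiro'))
    # '12345' is a made-up identifier used only for this demonstration.
    await db.set_identifier('New Game!', Medium.MANGA, Site.ANILIST, '12345')
    await db.set_medium_data('12345', Medium.MANGA, Site.ANILIST,
                             {'title': 'New Game!'})
    ids = await db.get_identifier('New Game!', Medium.MANGA)
    data = await db.medium_data_by_id(ids[Site.ANILIST], Medium.MANGA,
                                      Site.ANILIST)
    missing = await db.get_identifier('unknown', Medium.ANIME)
    return ids, data, missing


loop = asyncio.new_event_loop()
try:
    ids, data, missing = loop.run_until_complete(demo())
finally:
    loop.close()
```

An instance of such a controller would be passed as db_controller when constructing Minoshiro directly, followed by a call to pre_cache() as described in the API section.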
API¶
The following section outlines the API of Minoshiro.
Default Logger¶
get_default_logger()
Return a basic default logging.Logger.
Returns
A basic logger with a logging.StreamHandler attached and with level INFO.
Minoshiro¶
class Minoshiro(db_controller, *, logger=None, loop=None)
Represents the search instance.
It is suggested to use one of the class methods to create the instance if you wish to use one of the data controllers provided by the library.
Make sure you run the pre_cache() method if you initialized the class directly from the __init__ method.
Parameters
- db_controller(DataController) - Any subclass of DataController will work here.
- logger(Optional[logging.Logger]) - The logger object. If it is not provided, the default logger provided by the library is used.
- loop(Optional[Event loop]) - An asyncio event loop. If not provided, the default event loop is used.
classmethod from_postgres(db_config=None, pool=None, *, schema='minoshiro', cache_pages=0, logger=None, loop=None)
This method is a coroutine.
Get an instance of Minoshiro with PostgresController as the database controller.
Parameters
- db_config(dict) - A dict of database config for the connection. It should contain the keys in keyword arguments for the asyncpg.connection.connect function. An example config might look like this:
db_config = {
    "host": 'localhost',
    "port": '5432',
    "user": 'postgres',
    "database": 'postgres'
}
- pool(Pool) - An existing asyncpg connection pool. One of db_config or pool must not be None.
- schema(Optional[str]) - The name for the schema used. Defaults to minoshiro.
- cache_pages(Optional[int]) - The number of pages of anime and manga from Anilist to cache before the instance is created. Each page contains 40 entries max.
- logger(Optional[logging.Logger]) - The logger object. If it is not provided, the default logger provided by the library is used.
- loop(Optional[Event loop]) - An asyncio event loop. If not provided, the default event loop is used.
Returns
Instance of Minoshiro with PostgresController as the database controller.
classmethod from_sqlite(path, *, cache_pages=0, logger=None, loop=None)
This method is a coroutine.
Get an instance of Minoshiro with SqliteController as the database controller.
Parameters
- path(Union[str, pathlib.Path]) - The path to the SQLite3 database, either a string or a pathlib.Path object.
- cache_pages(Optional[int]) - The number of pages of anime and manga from Anilist to cache before the instance is created. Each page contains 40 entries max.
- logger(Optional[logging.Logger]) - The logger object. If it is not provided, the default logger provided by the library is used.
- loop(Optional[Event loop]) - An asyncio event loop. If not provided, the default event loop is used.
Returns
Instance of Minoshiro with SqliteController as the database controller.
pre_cache(cache_pages)
This method is a coroutine.
Pre-cache the database with anime and manga data.
This method is called by from_postgres() and from_sqlite(), so you do not need to call it if you created the class instance with either of those two methods.
Parameters
- cache_pages(int) - Number of Anilist pages to cache. There are 40 entries per page.
yield_data(query, medium, sites, *, timeout=3)
This method is a coroutine.
Yield the data for the search query from all sites.
Sites with no data found will be skipped.
Parameters
- query(str) - The search query.
- medium(Medium) - The medium type.
- sites(Optional[Iterable[Site]]) - An iterable of sites desired. If None is provided, all sites are searched by default.
- timeout(Optional[int]) - The timeout in seconds for each HTTP request. Default is 3.
Returns
An asynchronous generator that yields the site and data in a tuple for all sites requested.
get_data(query, medium, sites, *, timeout=3)
This method is a coroutine.
Get the data for the search query in a dict.
Sites with no data found will not be in the return value.
Parameters
- query(str) - The search query.
- medium(Medium) - The medium type.
- sites(Optional[Iterable[Site]]) - An iterable of sites desired. If None is provided, all sites are searched by default.
- timeout(Optional[int]) - The timeout in seconds for each HTTP request. Default is 3.
Returns
Data for all sites in a dict {Site: data}.
Note
When retrieving data from the result of this method, use the dict.get() method instead of square brackets. Example:
results = await search_instance.get_data(
    'Non Non Biyori', Medium.ANIME
)
# Good
anilist = results.get(Site.ANILIST)
# Bad, might raise KeyError
anilist = results[Site.ANILIST]
Enums¶
Minoshiro uses two enums to represent medium type and website.
class Site
- MAL = 1
- ANILIST = 2
- ANIMEPLANET = 3
- ANIDB = 4
- KITSU = 5
- MANGAUPDATES = 6
- LNDB = 7
- NOVELUPDATES = 8
- VNDB = 9
class Medium
- ANIME = 1
- MANGA = 2
- LN = 3
- VN = 4
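When the medium comes from user input, it usually has to be mapped onto an enum member first. The sketch below assumes Medium behaves like a standard enum.Enum and defines a stand-in that mirrors the members listed above so it runs without the library installed. The helper name `parse_medium` is hypothetical.

```python
from enum import Enum
from typing import Optional


# Stand-in mirroring the documented members. In real code:
# from minoshiro import Medium
class Medium(Enum):
    ANIME = 1
    MANGA = 2
    LN = 3
    VN = 4


def parse_medium(text: str) -> Optional[Medium]:
    """Map user input such as 'manga' to a Medium member, or None."""
    try:
        # Enum members can be looked up by name with square brackets.
        return Medium[text.strip().upper()]
    except KeyError:
        return None


manga = parse_medium(' manga ')
unknown = parse_medium('film')
by_value = Medium(3)  # lookup by value uses call syntax
```

Returning None for unknown input lets the caller decide how to report the error instead of surfacing a KeyError.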
Database Controllers¶
class DataController(logger)
An ABC (abstract base class) that deals with database caching.
See Extending DatabaseController for details.
class PostgresController(pool, logger, schema='minoshiro')
To be able to integrate with an existing database, all tables for minoshiro will be put under the minoshiro schema unless a different schema name is passed to the __init__ method.
Create the instance with the get_instance() method to make sure you have all the tables needed.
classmethod get_instance(logger, connect_kwargs=None, pool=None, schema='minoshiro')
This method is a coroutine.
Get a new instance of PostgresController.
This method will create the appropriate tables needed.
Parameters
- logger(Optional[logging.Logger]) - The logger object. If it is not provided, the default logger provided by the library is used.
- connect_kwargs(dict) - A dict of database config for the connection. It should contain the keys in keyword arguments for the asyncpg.connection.connect function. An example config might look like this:
connect_kwargs = {
    "host": 'localhost',
    "port": '5432',
    "user": 'postgres',
    "database": 'postgres'
}
- pool(Pool) - An existing asyncpg connection pool. One of connect_kwargs or pool must not be None.
- schema(str) - The name for the schema used. Defaults to minoshiro.
Returns
A new instance of PostgresController.
class SqliteController(path, logger, loop=None)
A SQLite3 data controller.
Create the instance with the get_instance() method to make sure you have all the tables needed.
classmethod get_instance(path, logger=None, loop=None)
This method is a coroutine.
Get a new instance of SqliteController.
This method will create the appropriate tables needed.
Parameters
- path(Union[str, pathlib.Path]) - The path to the SQLite3 database, either a string or a pathlib.Path object.
- logger(Optional[logging.Logger]) - The logger object. If it is not provided, the default logger provided by the library is used.
- loop(Optional[Event loop]) - An asyncio event loop. If not provided, the default event loop is used.
Returns
A new instance of SqliteController.
License¶
Minoshiro is released under the MIT License. See LICENSE file for details.