Add the ability to specify hosts and patterns lists to ignore the given
entry url and replace it with the fetched content url without touching
to origin_url.
This initial support should be reworked in the following months to move
the hardcoded ignore lists in the database.
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
- Leave origin_url unchanged if difference is an ending slash
- Leave origin_url unchanged if difference is scheme
- Ignore (noop) if difference is query string or fragment
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
Add a new helper to set a default title when it's empty:
1/ use basename part of entry's path, if any
2/ or use domain name
Fixes#2053
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
The PATCH method for the entry should only update what user sent to us and not the whole entry as it was before.
Also, sending tags when patching an entry will now remove all current tags & assocatied new ones.
Instead of saving the value of each field right into the content without any validation, it seems better to validate them.
This might sounds obvious now we say that.
This commit also decouples the "import" and "update" functions inside
ContentProxy. If a content array is available, it must be passed to the
new importEntry method.
Objects are always passed by reference, so it doesn't make sense to
return an object which is passed by reference as it will always be the
same object. This change makes the code a bit more readable.
Entry API can now have these new fields:
- content
- language
- preview_picture
- published_at
Re-use the ContentProxy to be able to do the same using the web UI (in the future).
htmLawed is used to clean stuff from content, I hope it’ll be enough to avoid security breach.
Lower content validation when we want to update an entry with content already defined. Before, language & content_type were required. If there weren’t provided, we re-fetched the content using graby. I think these fields aren’t required for an entry to be created. So I removed them.
Which means some import from the v1 export won’t be re-fetched since they provide content, url & title.
Also, remove liberation link from Readability import to avoid overlaping import (from wallabag v1, which had the same link)
If the website doesn't provide an og_image, the value will be false and so it'll be saved like that in the database.
We prefer to leave it as null instead of false.