230 commits

Author SHA1 Message Date
Kevin Decherf
44e63667d9 updateOriginUrl: add comment blocks for the parse_url diff check
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2018-10-24 22:13:03 +02:00
Kevin Decherf
5ba5e22a09 updateOriginUrl: rewrite some if statements, addressing feedback from PR
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2018-10-24 21:54:09 +02:00
Kevin Decherf
b49c87acf1 ignoreOriginUrl: add initial support of ignore lists
Add the ability to specify lists of hosts and patterns for which the given
entry url is ignored and replaced with the fetched content url, without
touching origin_url.

This initial support should be reworked in the following months to move
the hardcoded ignore lists into the database.

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2018-10-22 23:42:09 +02:00
Kevin Decherf
fc040c749d updateOriginUrl: add behavior when diff is fragment and query
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2018-10-22 23:08:58 +02:00
Kevin Decherf
e07fadea76 Refactor updateOriginUrl to include new behaviors
- Leave origin_url unchanged if difference is an ending slash
- Leave origin_url unchanged if difference is scheme
- Ignore (noop) if difference is query string or fragment

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2018-10-22 23:01:16 +02:00
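The three rules listed in this commit message can be sketched roughly as follows. This is an illustrative Python sketch, not the project's code: the function name `classify_url_difference` and the returned labels are hypothetical, and the `"other"` label stands for any case the message does not describe.

```python
from urllib.parse import urlsplit


def classify_url_difference(entry_url: str, fetched_url: str) -> str:
    """Classify how two URLs differ, following the rules in the commit message.

    The labels returned here are hypothetical names chosen for this sketch.
    """
    if entry_url == fetched_url:
        return "same"

    a, b = urlsplit(entry_url), urlsplit(fetched_url)
    fields = ("scheme", "netloc", "path", "query", "fragment")
    diff = {name for name in fields if getattr(a, name) != getattr(b, name)}

    # Rule 1: the only difference is an ending slash on the path.
    if diff == {"path"} and a.path.rstrip("/") == b.path.rstrip("/"):
        return "leave_origin_url"
    # Rule 2: the only difference is the scheme (e.g. http vs https).
    if diff == {"scheme"}:
        return "leave_origin_url"
    # Rule 3: the difference is limited to the query string and/or fragment.
    if diff <= {"query", "fragment"}:
        return "noop"
    # Anything else is outside the rules quoted above.
    return "other"


if __name__ == "__main__":
    print(classify_url_difference("http://example.com/a", "http://example.com/a/"))
    print(classify_url_difference("http://example.com/a", "https://example.com/a"))
    print(classify_url_difference("http://example.com/a?x=1", "http://example.com/a#top"))
```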
Kevin Decherf
781864b954 ContentProxy: swap entry url to origin_url and set new url according to graby content
Closes #3529

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2018-10-21 16:15:31 +02:00
Kevin Decherf
4a81360efc ContentProxy: fix a corner case when entry.url is empty in updateEntry
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2018-10-21 16:13:20 +02:00
Tobi823
83f1c3274f Run php-cs-fixer for fixing coding standard issues 2018-09-23 22:20:43 +02:00
Tobi823
7a65c2017b Override the given parameter ($title) with the PDF title,
(hopefully) correctly converted to UTF-8
2018-09-21 13:23:39 +02:00
Tobi823
c01d953292 Add tests for logic
Try to translate the title of a PDF from UTF-8 (then UTF-16BE, then WINDOWS-1252) to UTF-8
2018-09-21 13:15:00 +02:00
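The fallback order named in this commit message (UTF-8, then UTF-16BE, then WINDOWS-1252) could look roughly like the sketch below. It is written in Python for illustration only; the function name `pdf_title_to_utf8` is hypothetical and the project's actual detection logic may differ.

```python
def pdf_title_to_utf8(raw: bytes) -> str:
    """Decode a raw PDF title by trying several encodings in order.

    Hypothetical helper: tries UTF-8, then UTF-16BE, then Windows-1252,
    and returns the first result that decodes without error.
    """
    for encoding in ("utf-8", "utf-16-be", "cp1252"):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort (an assumption of this sketch): keep what can be read
    # and replace invalid bytes.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(pdf_title_to_utf8("café".encode("utf-8")))
    print(pdf_title_to_utf8("café".encode("utf-16-be")))
    print(pdf_title_to_utf8(b"caf\xe9s"))
```

One caveat with this ordering: many even-length byte strings are also valid UTF-16BE, so a Windows-1252 title of even length may be decoded by the second step and come out garbled. A real implementation would probably need an extra plausibility check.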
Tobi823
f80f16dfc8 Try to detect the character encoding in PDFs and try to translate
the title from the PDF to UTF-8
2018-09-21 13:15:00 +02:00
Tobi823
8648f0c005 Remove type declaration for PHP 5 compatibility 2018-09-21 13:15:00 +02:00
Tobi823
d76a5a6d60 Bugfix: Sanitize the title of a saved webpage from invalid UTF-8 characters 2018-09-21 13:15:00 +02:00
Kevin Decherf
2a1ceb67b4 php-cs-fixer
Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2018-09-05 14:25:32 +02:00
Simounet
e6f12c0734 More robust srcset image attribute handling
Linked to HTMLawed PR https://github.com/kesar/HTMLawed/pull/17
2018-07-12 14:29:30 +02:00
Simounet
3fbbe0d9f1 Fix image downloading on null image path 2018-07-05 11:40:51 +02:00
Simounet
c15bb5ad72 Fix srcset attribute on images downloaded 2018-06-01 13:49:16 +02:00
Kevin Decherf
af29e1bf07 Fix empty title and domain_name when exception is thrown during fetch
Add a new helper to set a default title when it's empty:
1/ use basename part of entry's path, if any
2/ or use domain name

Fixes #2053

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2017-12-13 22:44:31 +01:00
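The two-step fallback described in this commit message (basename of the path, otherwise the domain name) can be sketched as below. The Python function `default_title` is a hypothetical stand-in for the helper mentioned in the message, and stripping a trailing slash before taking the basename is an assumption of this sketch.

```python
import posixpath
from urllib.parse import urlsplit


def default_title(url: str) -> str:
    """Pick a fallback title for an entry whose title is empty.

    1. Use the basename part of the URL path, if any.
    2. Otherwise use the domain name.
    """
    parts = urlsplit(url)
    # Assumption: "/docs/" should yield "docs", so drop the trailing slash.
    name = posixpath.basename(parts.path.rstrip("/"))
    return name or parts.hostname or ""


if __name__ == "__main__":
    print(default_title("https://example.com/docs/report.pdf"))
    print(default_title("https://example.com/"))
```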
Jeremy Benoist
709e21a3f4
Define storeArticleHeaders false by default
Fix tests which must use `$storeArticleHeaders`.
Fix CS
2017-11-21 10:37:36 +01:00
Nicolas Lœuillet
8a21985474 Added internal setting to enable/disable headers storage 2017-11-20 18:47:48 +01:00
Jeremy Benoist
15a6402f75
Properly run php-cs-fixer 2017-10-28 20:16:43 +02:00
Martin Trigaux
385e651684 php-cs-fixer
php bin/php-cs-fixer fix src/Wallabag/CoreBundle/Helper/EntriesExport.php
2017-10-28 17:17:22 +02:00
Martin Trigaux
c779373f2c Set the title in a separated chapter
Set the export option on the same page, same as done in producePdf
Move the ToC at the end of the book so the title page is the first one
2017-10-28 14:49:14 +02:00
Martin Trigaux
a6e9ad0b7d add a title page
The first page of the book is the title
2017-10-28 10:45:37 +02:00
Jeremy Benoist
9dd67fa342
CS 2017-10-11 10:43:36 +02:00
Nicolas Lœuillet
8f187e280f
Fixed @j0k3r's review 2017-10-11 10:43:19 +02:00
Nicolas Lœuillet
dc7fa8dfc6
Fixed @tcitworld's review 2017-10-11 10:43:19 +02:00
Nicolas Lœuillet
b1428a1cf8
Translated first page of exported article 2017-10-11 10:43:19 +02:00
Jeremy Benoist
3ef055ced3
CS 2017-10-09 16:47:15 +02:00
Nicolas Lœuillet
78b36d4dbe Merge pull request #3332 from nclsHart/better-txt-export
Better entry txt export using html2text
2017-09-06 15:08:12 +02:00
Kevin Decherf
7036d91fe7 Tag: render tags case-insensitive by storing them in lowercase
Fixes #2502

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2017-08-27 16:51:23 +02:00
Nicolas Hart
c660878388 better entry txt export using html2text 2017-08-27 00:04:21 +02:00
Nicolas Hart
52b84c11a5 Fix some namespaces and phpdoc 2017-07-29 22:51:50 +02:00
Jeremy Benoist
ff9f89fd23
Add a test for updatePublishedAt
To avoid an error when content is re-submitted and it previously had a
published date.

Also, fix the `testPostSameEntry`
2017-07-24 17:07:47 +02:00
Simounet
b236d3f627
Fix updatePublishedAt on already parsed article's date 2017-07-24 16:39:07 +02:00
Jérémy Benoist
f39152ad6e Merge pull request #3266 from egilli/export-domain-as-author
Use the article publisher as author for exported files
2017-07-11 09:21:49 +02:00
Étienne Gilli
eeabca8090 Make updateAuthor code simpler to read 2017-07-10 10:08:20 +02:00
Étienne Gilli
c57f69d967 Use the article publisher as author for export
When exporting an entry, use the publishedBy field as author name for
epub, mobi and pdf formats. Fallback to domain name if empty.
2017-07-09 18:33:14 +02:00
Étienne Gilli
07320a2bd2 Use the article domain as author for export files
When exporting an entry, use the domain name as author name for epub,
mobi and pdf formats, instead of 'wallabag'.
Change the author from array to string, because for now, there is always
only one author.
2017-07-08 19:53:43 +02:00
Jeremy Benoist
927c9e796f
Add EntityTimestampsTrait to handle dates
Refactor timestamps() method to avoid re-writing it on each entity
2017-07-06 09:01:51 +02:00
Jeremy Benoist
c18a2476b6
CS 2017-07-03 13:56:39 +02:00
Jeremy Benoist
d0ec2ddd23
Fix validateAndSetPreviewPicture
Which wasn't covered by a test!
2017-07-03 13:45:04 +02:00
Jeremy Benoist
a05b61159e
Fix PATCH method
The PATCH method for the entry should only update what user sent to us and not the whole entry as it was before.
Also, sending tags when patching an entry will now remove all current tags and associate the new ones.
2017-07-03 13:45:04 +02:00
Jeremy Benoist
f808b01692
Add a real configuration for CS-Fixer 2017-07-01 09:52:38 +02:00
Jeremy Benoist
18c38dffc6
Add RSS tags feeds 2017-06-21 11:44:35 +02:00
Jérémy Benoist
80784b782b Merge pull request #2683 from wallabag/credentials-in-db
Store credentials in DB
2017-06-20 16:40:48 +02:00
Thomas Citharel
bead8b42da
Fix reviews
Encrypt username too
Redirect to list after saving credentials
Fix typos

Signed-off-by: Thomas Citharel <tcit@tcit.fr>
2017-06-20 16:03:39 +02:00
Jeremy Benoist
906424c1b6
Crypt site credential password 2017-06-20 16:03:35 +02:00
Thomas Citharel
41d45c6122 Fix empty language and preview pics 2017-06-12 16:46:33 +02:00
Jeremy Benoist
80e49ba7b0
Convert - to _ in language
Mostly to increase language support
2017-06-09 11:42:09 +02:00
Jeremy Benoist
42f3bb2c63
Use Locale instead of Language 2017-06-09 11:28:04 +02:00
Jeremy Benoist
be54dfe4e6
CS 2017-06-08 21:56:20 +02:00
Jeremy Benoist
0d349ea670
Validate language & preview picture fields
Instead of saving the value of each field right into the content without any validation, it seems better to validate them.
This might sound obvious now that we say it.
2017-06-08 21:51:46 +02:00
Jérémy Benoist
c0d756f67d Merge pull request #3181 from wallabag/api-content-patch
Add ability to patch an entry with more fields
2017-06-07 15:40:59 +02:00
Jeremy Benoist
577c0b6dd8
Use an alternative way to detect image
When parsing content to retrieve images to save locally, we only check for the content-type of the image response.
In some cases, that value is empty.
Now we’re also checking the first few bytes of the content as an alternative way to detect whether it’s an image wallabag can handle.
We might get better image support using that alternative method.
2017-06-05 22:54:02 +02:00
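The detection strategy described in this commit message (trust the content-type header when present, otherwise inspect the first few bytes) might be sketched like this. The Python below is illustrative only: the function name `detect_image_type` and the set of supported formats (JPEG, PNG, GIF) are assumptions, not taken from the project's code. The byte signatures themselves are the standard ones for those formats.

```python
from typing import Optional

# Leading byte signatures ("magic numbers") of common image formats.
SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}

# Assumed mapping from content-type values to the same short names.
CONTENT_TYPES = {"image/jpeg": "jpeg", "image/png": "png", "image/gif": "gif"}


def detect_image_type(content_type: Optional[str], body: bytes) -> Optional[str]:
    """Return the image type, using the header first and the bytes as fallback."""
    header = (content_type or "").split(";")[0].strip().lower()
    if header in CONTENT_TYPES:
        return CONTENT_TYPES[header]
    # Header missing or unrecognised: look at the first few bytes instead.
    for signature, kind in SIGNATURES.items():
        if body.startswith(signature):
            return kind
    return None


if __name__ == "__main__":
    print(detect_image_type("", b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
    print(detect_image_type(None, b"<html></html>"))
```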
Jeremy Benoist
645291e8fe
Add ability to patch an entry with more fields
Like when we create an entry, we can now patch an entry with new fields:
- content
- language
- preview_picture
- published_at
- authors
2017-06-02 20:52:49 +02:00
Jérémy Benoist
a687c8d915 Merge pull request #2708 from jcharaoui/import-disablecontentupdate
Import disableContentUpdate
2017-06-02 11:26:37 +02:00
Jeremy Benoist
9bf7752f73
CS 2017-06-01 22:58:38 +02:00
Jeremy Benoist
fcad69a427
Replace images with &
Images with `&` in the path weren’t replaced correctly because they may appear with `&amp;` in the HTML instead.

Replacing `&` with `&amp;` fixes the problem.
2017-06-01 22:50:33 +02:00
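The problem described in this commit message can be reproduced with a short sketch: the image URL as fetched contains a raw `&`, while the HTML source carries the escaped form `&amp;`, so a plain string replacement misses it. The Python function `replace_image_url` below is hypothetical and only illustrates the idea of also trying the escaped form.

```python
def replace_image_url(html: str, original_url: str, local_url: str) -> str:
    """Replace an image URL in HTML, whether or not its `&` are escaped."""
    escaped_url = original_url.replace("&", "&amp;")
    # Replace the escaped form first, then the raw form. Doing the raw
    # replacement second leaves already-rewritten text untouched.
    html = html.replace(escaped_url, local_url)
    return html.replace(original_url, local_url)


if __name__ == "__main__":
    page = '<img src="http://a.test/i.png?w=1&amp;h=2">'
    print(replace_image_url(page, "http://a.test/i.png?w=1&h=2", "/assets/i.png"))
```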
Jeremy Benoist
ec97072152
No need to catch that Exception 2017-06-01 11:45:02 +02:00
Jeremy Benoist
6acadf8e98
Rewrote code & fix tests 2017-06-01 11:31:45 +02:00
Jeremy Benoist
843182c7cf
CS 2017-06-01 09:52:09 +02:00
Jeremy Benoist
d5c2cc54b5
Fix tests 2017-06-01 09:49:15 +02:00
Jerome Charaoui
d0e9b3d640
Add disableContentUpdate import option
This commit also decouples the "import" and "update" functions inside
ContentProxy. If a content array is available, it must be passed to the
new importEntry method.
2017-06-01 09:48:14 +02:00
Jerome Charaoui
7aba665e48
Avoid returning objects passed by reference.
Objects are always passed by reference, so it doesn't make sense to
return an object which is passed by reference as it will always be the
same object. This change makes the code a bit more readable.
2017-06-01 09:43:01 +02:00
Jeremy Benoist
53da8ad844
Page parameter was never used in the function
It could have been used if we set the current page inside PreparePagerForEntries.
But we did that in each controller because we can have an OutOfRangeCurrentPageException
2017-06-01 09:29:18 +02:00
Jeremy Benoist
f0378b4d7c
Forced date can now be a timestamp too
Also add more tests for forced content
2017-05-31 14:00:15 +02:00
Jeremy Benoist
9e349f08a6
Improve docs 2017-05-31 14:00:15 +02:00
Jeremy Benoist
0d6cfb884c
Remove htmlawed and use graby instead
Instead of using htmlawed (which is already used in graby), use graby directly (which requires some refactoring on graby's side).
Still needs some tests.
2017-05-31 14:00:15 +02:00
Jeremy Benoist
74a75f7d43
Use graby ContentExtractor to clean html
It might be better to re-use some graby functionalities to clean html instead of building a new system.
2017-05-31 14:00:15 +02:00
Jeremy Benoist
e668a8124c
Allow other fields to be sent using the API
Entry API can now have these new fields:
- content
- language
- preview_picture
- published_at

Re-use the ContentProxy to be able to do the same using the web UI (in the future).
htmLawed is used to clean stuff from content, I hope it’ll be enough to avoid a security breach.

Lower content validation when we want to update an entry with content already defined. Before, language & content_type were required. If they weren’t provided, we re-fetched the content using graby. I think these fields aren’t required for an entry to be created. So I removed them.
Which means some imports from the v1 export won’t be re-fetched since they provide content, url & title.

Also, remove liberation link from Readability import to avoid overlapping import (from wallabag v1, which had the same link)
2017-05-31 13:59:45 +02:00
Nicolas Lœuillet
d61fd8be4f Merge pull request #3138 from Kdecherf/2835-tags
Ignore ActionMarkAsRead when removing tag from entry
2017-05-31 11:48:42 +02:00
Kevin Decherf
5dbf3f2326 TagController: ignore ActionMarkAsRead when removing tag from entry
Fixes #2835

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2017-05-31 00:36:46 +02:00
Jeremy Benoist
5fe65baee5
Fix some Scrutinizer issues 2017-05-30 11:39:15 +02:00
Thomas Citharel
5d3deafd3e CS
Signed-off-by: Thomas Citharel <tcit@tcit.fr>
2017-05-28 01:16:01 +02:00
Thomas Citharel
6bc6fb1f60 Move Tags assigner to a separate file
Signed-off-by: Thomas Citharel <tcit@tcit.fr>
2017-05-27 22:08:14 +02:00
Nicolas Lœuillet
0a033767db
Added logger when we match Tagging rules 2017-05-12 13:13:19 +02:00
Nicolas Lœuillet
dda6a6addc
Added headers field in Entry 2017-05-11 14:18:21 +02:00
Jeremy Benoist
94b232bbb8
Skip auth when no credentials are found
If we can’t find a credential for the current host, even if it requires login, we won’t add any and the website will be fetched without any login.
2017-05-09 22:53:42 +02:00
Jeremy Benoist
d047530dc0
CS 2017-05-09 11:17:09 +02:00
Bertrand Dunogier
5b914b0422 Improved Guzzle subscribers extensibility
Allows 3rd parties to register new guzzle subscribers by adding extra calls to the http_client_factory service.
2017-05-04 21:44:34 +02:00
Nicolas Lœuillet
64f1d8f77a Merge pull request #3024 from wallabag/store-date
Added publication date and author
2017-04-18 13:12:28 +02:00
Nicolas Lœuillet
7b0b3622ab Added author of article 2017-04-09 15:24:51 +02:00
Nicolas Lœuillet
5e9009ce86 Added publication date 2017-04-05 22:22:52 +02:00
Martin Trigaux
1b70990b01 Add export notice at the end of the epub
The text "Produced by wallabag with PHPePub" is the first page of any epub.

On ebook readers (e.g. Kobo), it is common to use the first page as the cover of
unread books, which makes it more difficult to differentiate the books.

Move the Notices chapter at the end of the book.
2017-04-05 09:24:48 +02:00
Kevin Decherf
7a3260ae9e Save alpha channel when downloading PNG images
Fixes #2805

Signed-off-by: Kevin Decherf <kevin@kdecherf.com>
2017-03-29 21:58:29 +02:00
Jeremy Benoist
7bf6b555f5
Log restricted access value
It might help to figure out if we have enabled it or not
2017-02-13 14:20:03 +01:00
Thomas Citharel
8303b037fb add cli export
Signed-off-by: Thomas Citharel <tcit@tcit.fr>
2017-01-22 12:51:14 +01:00
Jeremy Benoist
3d71d40349
Avoid false preview image
If the website doesn't provide an og_image, the value will be false and so it'll be saved like that in the database.
We prefer to leave it as null instead of false.
2017-01-10 17:42:36 +01:00
Nicolas Lœuillet
e044d27f82
Replaced chmod for download pictures feature 2016-12-08 13:04:15 +01:00
Jeremy Benoist
106bdbcd0a Add some comments 2016-12-04 11:27:49 +01:00
Jerome Charaoui
e858018fdd Prevent undefined index when import fetching fails 2016-12-02 22:45:04 -05:00
Jerome Charaoui
36e6ef52a1 Imported entries which fail to fetch get standard error body 2016-12-02 22:42:36 -05:00
Jerome Charaoui
29dca43236 Retain imported content if fetching fails, fixes #2658 2016-12-02 22:41:35 -05:00
Nicolas Lœuillet
d51093a7d9
Added documentation and missing translations 2016-11-22 17:32:24 +01:00
Nicolas Lœuillet
d64bf7953b
Added internal setting to enable/disable articles with paywall 2016-11-22 14:56:53 +01:00
Nicolas Lœuillet
40f3ea57fb
Cleared CookieJar to avoid websites that use cookies for analytics 2016-11-22 14:25:51 +01:00
Bertrand Dunogier
7aab0ecf2f Added authentication for restricted access articles
Fix #438. Thank you so much @bdunogier
2016-11-22 14:01:46 +01:00
Jeremy Benoist
68003139e1
Merge remote-tracking branch 'origin/master' into 2.2
# Conflicts:
#	.editorconfig
#	docs/de/index.rst
#	docs/de/user/import.rst
#	docs/en/index.rst
#	docs/en/user/configuration.rst
#	docs/en/user/import.rst
#	docs/fr/index.rst
#	docs/fr/user/import.rst
#	src/Wallabag/CoreBundle/Command/InstallCommand.php
#	src/Wallabag/CoreBundle/Resources/translations/messages.da.yml
#	src/Wallabag/CoreBundle/Resources/translations/messages.de.yml
#	src/Wallabag/CoreBundle/Resources/translations/messages.en.yml
#	src/Wallabag/CoreBundle/Resources/translations/messages.es.yml
#	src/Wallabag/CoreBundle/Resources/translations/messages.fa.yml
#	src/Wallabag/CoreBundle/Resources/translations/messages.fr.yml
#	src/Wallabag/CoreBundle/Resources/translations/messages.it.yml
#	src/Wallabag/CoreBundle/Resources/translations/messages.oc.yml
#	src/Wallabag/CoreBundle/Resources/translations/messages.pl.yml
#	src/Wallabag/CoreBundle/Resources/translations/messages.pt.yml
#	src/Wallabag/CoreBundle/Resources/translations/messages.ro.yml
#	src/Wallabag/CoreBundle/Resources/translations/messages.tr.yml
#	src/Wallabag/CoreBundle/Resources/views/themes/baggy/Config/index.html.twig
#	web/bundles/wallabagcore/themes/baggy/css/style.min.css
#	web/bundles/wallabagcore/themes/baggy/js/baggy.min.js
#	web/bundles/wallabagcore/themes/material/css/style.min.css
#	web/bundles/wallabagcore/themes/material/js/material.min.js
2016-11-19 15:30:49 +01:00
Nicolas Lœuillet
10b3509757
Added http_status in Entry entity 2016-11-18 15:09:21 +01:00