Cleanup avatars cache based on last-access-date (#291)
wojciech.kapcia@tigase.net opened 4 years ago

Currently there is no cleanup mechanism and the cache can grow significantly.

Andrzej Wójcik (Tigase) commented 4 years ago

This should be solved in the redesigned accounts list.

Andrzej Wójcik (Tigase) commented 4 years ago

I've tried to apply this logic, however it was not usable. For some reason, last access date is not updated on macOS. Due to that I've lost most of the cached avatars.

I've considered manually updating the last access date (once a day), but that would not give us good results, because when should we update this value?

  1. when we want to present the avatar? - for sure
  2. when we receive a presence with a photo hash? - maybe (we fetch unknown avatars at this time, so it would be reasonable)

However, even if we agreed to update in both cases, we could still drop some "currently unused" avatars, e.g. a vCard avatar which is not advertised (the client does not send a photo hash) while the client currently has a different avatar published with PEP.

Looking at the amount of data stored, I would prefer to "scale down" those avatars and keep them all instead of removing the oldest ones - refetching is more problematic and we do not gain much from this cleanup.
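
For illustration, the manual "once a day" update considered above could look roughly like the sketch below. It is written in Python only to keep it short and is not BeagleIM code; `mark_avatar_used`, the in-memory `_last_touched` map and the idea that cache files are named after the avatar hash are assumptions made for this example. It would be called from both places listed above (when presenting an avatar and when a presence with a photo hash arrives).

```python
import os
import time
from pathlib import Path
from typing import Dict, Optional

ONE_DAY = 24 * 60 * 60  # "once a day" throttle, in seconds

# Hypothetical in-memory record of when each avatar was last touched.
_last_touched: Dict[str, float] = {}


def mark_avatar_used(cache_dir: Path, avatar_hash: str,
                     now: Optional[float] = None) -> bool:
    """Set the cached file's access time to `now`, at most once a day.

    Returns True if the timestamp was updated, False if it was skipped.
    """
    now = time.time() if now is None else now
    path = cache_dir / avatar_hash
    if not path.is_file():
        return False  # nothing cached under this hash
    if now - _last_touched.get(avatar_hash, 0.0) < ONE_DAY:
        return False  # already touched within the last day
    mtime = path.stat().st_mtime
    os.utime(path, (now, mtime))  # (atime, mtime): keep the modification time
    _last_touched[avatar_hash] = now
    return True
```

Note that this does not address the problem described above: an avatar that is neither displayed nor advertised would never be touched and would still look unused.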

wojciech.kapcia@tigase.net commented 4 years ago

I've tried to apply this logic, however it was not usable. For some reason, last access date is not updated on macOS. Due to that I've lost most of the cached avatars.

Weird, it seems that in my case it's updated:

wojtek@atlantiscity.local ~/dev $ ls -lahut --time=atime /Users/wojtek/Library/Containers/org.tigase.messenger.BeagleIM/Data/Library/Caches/org.tigase.messenger.BeagleIM/avatars
total 29M
drwxr-xr-x 1681 wojtek staff  53K Oct 23 12:51 .
-rw-r--r--    1 wojtek staff  34K Oct 23 12:46 a2f7903c6278123c33faeae60b759e838488a5c6
-rw-r--r--    1 wojtek staff 6.8K Oct 23 12:46 949b1dae59ae4676c7bc816344d8057497189109
-rw-r--r--    1 wojtek staff  11K Oct 23 12:46 aad7923d4f5b1758d7e57fc77f214b18c0ed05ef
-rw-r--r--    1 wojtek staff 6.8K Oct 23 12:46 3c940aa684e8c82fd3ec867bdbca55e04700449a
-rw-r--r--    1 wojtek staff  38K Oct 23 12:46 2bb9f3ac07df8f41e8e2749d01ad97bec357f7ed
-rw-r--r--    1 wojtek staff  24K Oct 22 19:49 924c38eb0f757a21d91ef27583aeb6142d7a0d5d
-rw-r--r--    1 wojtek staff 6.8K Oct 22 16:50 134cb8e80c6ccb24c0fc869c030757c6f81b49a1
-rw-r--r--    1 wojtek staff  11K Oct 22 14:58 7b5c5e9387517c6ddab02c8ee562829868f8a411
-rw-r--r--    1 wojtek staff 6.7K Oct 22 13:02 f9d051d7b77d8259802065a0b3fe886be7ddf29f
-rw-r--r--    1 wojtek staff  82K Oct 22 12:21 e0ea317a372e672987aa20e159902f40601adc56
-rw-r--r--    1 wojtek staff 9.8K Oct 21 20:28 8d14487d3a193319cf5ebf2f0022834d3046aaa0
-rw-r--r--    1 wojtek staff  38K Oct 21 13:56 92ae17ca4166a564034b686e4a5a64967ea0970e
-rw-r--r--    1 wojtek staff 6.5K Oct 20 17:10 77f743e8c762e432efb042e349f880cbaeb85b7a
-rw-r--r--    1 wojtek staff 6.6K Oct 20 16:23 8cc46523ea310525e3846fe59e93ccfc32f6071b
-rw-r--r--    1 wojtek staff  25K Oct 20 12:31 0243d8c26bf7eb5622cc25b83a50bc3b25b8c353
-rw-r--r--    1 wojtek staff 6.6K Oct 20 12:21 fbdd98d0f1a126c3383eeef01c48748e71f6e999
-rw-r--r--    1 wojtek staff 7.8K Oct 20 12:20 3dc7064e5be357052e4c1ab3d0854dd102f093a6
-rw-r--r--    1 wojtek staff  58K Oct 19 18:31 b14cfa36d9348857183d5017d1108d20d969d095
-rw-r--r--    1 wojtek staff 6.7K Oct 19 12:44 b33945081883f15e355060ea9262712f7e869051
-rw-r--r--    1 wojtek staff 6.7K Oct 19 12:44 71fdbf83240d5facd9476ae8bcb15a3f9723ee55
-rw-r--r--    1 wojtek staff 6.7K Oct 16 19:14 352521466dd724cfda37e0b8e73177f3be26cbef
-rw-r--r--    1 wojtek staff 8.0K Oct 16 18:25 e2120cdfc44135713e568c2709fbaf1467bedb56
-rw-r--r--    1 wojtek staff 7.1K Oct 16 15:00 59e5815e7a35e1b14950dc2176d51932d1e0e989
-rw-r--r--    1 wojtek staff 6.8K Oct 16 12:10 25431167d627dc420104a8e5bfcf550b3200a04a
-rw-r--r--    1 wojtek staff  18K Oct 16 12:10 afc54de2e441e025cfb409e4c3d96e57814f5b9a
-rw-r--r--    1 wojtek staff 6.8K Oct 15 19:04 e2afcb4ef1677ae7841e2346c96e89d32aee0d88
-rw-r--r--    1 wojtek staff  59K Oct 15 17:31 d348023afe71fca992ffd5cd9c41e53500709b61
-rw-r--r--    1 wojtek staff  18K Oct 14 18:52 0f82b89d3ee812d3a710b8d58f1809e0cf5c2b5a
-rw-r--r--    1 wojtek staff 5.0K Oct 14 17:06 4f6127ade44dce5ced0d5aa77285002d433e2620
-rw-r--r--    1 wojtek staff 8.5K Oct 14 12:31 71e2a395353b305a797cfb38831efa76d754d325
-rw-r--r--    1 wojtek staff 3.6K Oct 14 12:31 57cef62c2ba41e58939a081cb8a4730e0a7048e5
-rw-r--r--    1 wojtek staff  20K Oct 14 08:46 42c9d75f59312dffbc079a357231e862d2ae5dfc
-rw-r--r--    1 wojtek staff 8.8K Oct 14 08:45 ffb5f36ede4c6211e83c766c40c77107f0d2dec6
-rw-r--r--    1 wojtek staff  25K Oct 13 18:00 fb9fe4d6c0082d385d071f93d6d237e5ea562b00
-rw-r--r--    1 wojtek staff  12K Oct 13 12:36 c5dd8fd92c5863172e8dddc1f0335d70ffa525a9
-rw-r--r--    1 wojtek staff 9.7K Oct 13 12:08 078b1af380d4eddbbce16505ebacff686999a933
-rw-r--r--    1 wojtek staff 6.8K Oct 13 12:08 476371caee1c96b4a0f41065713856bf49c444be
-rw-r--r--    1 wojtek staff 6.7K Oct 12 12:54 7aba9c584d8e4a4a2e2e80cba54772da8e97dcdc
-rw-r--r--    1 wojtek staff 2.5K Oct 12 12:41 2aba8ebd43a6ee14297c8bb93701051d5751c639
-rw-r--r--    1 wojtek staff 6.8K Oct 12 12:39 79a54bc32f1bf8f7dcc4a057cfa7bbb3d2f358c4
-rw-r--r--    1 wojtek staff 6.4K Oct 12 12:39 4cf2fbc3ee733935972ab7bc8bd0a61f9e0b5ed5
-rw-r--r--    1 wojtek staff 6.7K Oct 12 12:39 7fecd2fb56b36721ea138c586d88a40b511fc188
…
-rw-r--r--    1 wojtek staff 5.9K Oct  9  2019 b0451cda15a3ad5de94e28952ab6d1aace42cbfe
-rw-r--r--    1 wojtek staff 9.2K Oct  9  2019 5dd09ad7e32a917f4331df2d3a1df4fda8f32a96
-rw-r--r--    1 wojtek staff 3.8K Oct  9  2019 d6f3bdf122b5f08c53b61ef99499b9437f85ea63
-rw-r--r--    1 wojtek staff 1.9K Oct  9  2019 4f8e90b7f5cfcb9f1eb55e0e9e4ade73880172c7
-rw-r--r--    1 wojtek staff 1.1K Oct  8  2019 71e3a24b8eef286b6b95fc097cfafac11770ed01
-rw-r--r--    1 wojtek staff  79K Oct  8  2019 8971bebb412241ef185e855e7a664ad7773d1263
-rw-r--r--    1 wojtek staff 157K Oct  8  2019 6eb5c1bbf073891b246d04bcc2712c068afddf54
-rw-r--r--    1 wojtek staff  36K Oct  3  2019 fe1f6a583dba315308475af7f329a576e1b6cdaf
-rw-r--r--    1 wojtek staff  17K Oct  3  2019 42bd4ea318feb829074b842f40b172ff14fcefe2
-rw-r--r--    1 wojtek staff 6.6K Oct  3  2019 0c517b985d89d308798b891068dbc0d10d882863
-rw-r--r--    1 wojtek staff  21K Oct  1  2019 07ec6b040d4621a75c2e118e1bb8b3f6a4ab32f5
-rw-r--r--    1 wojtek staff 388K Sep 13  2019 68cd31d385cf25aca7df5234a48b2585dbad1ed3
-rw-r--r--    1 wojtek staff 209K Jun 18  2019 2o0llk5a4b8xbh8wix0

and I suggested it because of that.

Regarding manual updating - I think that the OS updates this value when you read the file (so your case (1)). IMHO keeping avatars that haven't been displayed for more than a couple of months doesn't make all that much sense, but this is a huge IMHO. (Though I'd be rather reluctant to scale them down - in my case there are 1679 avatars totalling ~25M, which is not that much nowadays, size-wise.)
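
A minimal sketch of the cleanup proposed in this issue, assuming the file system really does maintain access times: list the files whose last access is older than some threshold. The function name `stale_avatars` and the 90-day value in the usage note are illustrative assumptions; the thread only says "a couple of months". The sketch is in Python for brevity and only reports candidates, it does not delete anything.

```python
import time
from pathlib import Path
from typing import List, Optional


def stale_avatars(cache_dir: Path, max_age_days: float,
                  now: Optional[float] = None) -> List[Path]:
    """Return cached files whose access time is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return sorted(
        p for p in cache_dir.iterdir()
        if p.is_file() and p.stat().st_atime < cutoff
    )
```

Calling `stale_avatars(cache_dir, 90)` on the avatars directory would return the removal candidates, which can then be reviewed before anything is unlinked.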

Andrzej Wójcik (Tigase) commented 4 years ago

@wojtek I've checked and confirmed that this value was not updated in my case - timestamp matched file creation date. As for size, I've also concluded that nowadays it does not matter that much (as most of them are very small files).
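
One way to check for the symptom described above (access time equal to the creation date) is sketched below in Python. `atime_never_advanced` is a hypothetical helper; `st_birthtime` is only exposed on some platforms (macOS among them), so the sketch falls back to the modification time elsewhere, which is an approximation.

```python
from pathlib import Path


def atime_never_advanced(path: Path) -> bool:
    """True if the access time is not later than the file's creation time."""
    st = path.stat()
    # st_birthtime exists on macOS/BSD; fall back to mtime where it is missing.
    created = getattr(st, "st_birthtime", st.st_mtime)
    return st.st_atime <= created
```

If this returns True for files that are known to have been displayed since they were written, access-time based cleanup is not reliable on that machine.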

As for the removal of "unused" avatars: it is possible that we have 2 avatars for each JID (1 PEP and 1 vCard). If they do not match, then we have 2 avatars. In the UI of BeagleIM, I'm using "the best one" (PEP if available, otherwise vCard). Now it may happen that the PEP avatar is "removed" by the user and we need to fall back to the vCard one, and that is OK (as most likely we have a vCard stored in the database). However, if we remove the vCard avatar (i.e. because it was not used while the user had a PEP avatar) and we receive a presence with a photo hash, there is no easy way to check whether the hash matches or whether we need to fetch a new vCard. Right now I just need to check if a file matching the hash exists. Without that, I would need a separate table "joined" with the vCard table to keep hashes of all photos embedded in those vCards (or fetch the vCard, decode the data and calculate the hash). In both cases, the app would have a lot more to do than just check if a file exists, and the benefit of this change is small, if any.
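
The trade-off above can be sketched as follows (Python for brevity, not BeagleIM code). The sketch assumes that cache files are named after the hex SHA-1 of the image data, which is what the 40-character file names in the listing suggest and what vCard-based avatar hashes use; all function names are hypothetical.

```python
import base64
import hashlib
from pathlib import Path
from typing import Optional


def is_cached(cache_dir: Path, photo_hash: str) -> bool:
    """Current cheap check: does a file named after the hash exist?"""
    return (cache_dir / photo_hash).is_file()


def vcard_photo_hash(binval_base64: str) -> str:
    """Costlier alternative: decode the vCard photo and hash it (SHA-1, hex)."""
    return hashlib.sha1(base64.b64decode(binval_base64)).hexdigest()


def best_avatar(pep_hash: Optional[str],
                vcard_hash: Optional[str]) -> Optional[str]:
    """Pick "the best one": PEP if available, otherwise vCard."""
    return pep_hash or vcard_hash
```

With the vCard avatar file kept in the cache, handling a presence is a single `is_cached` call; once that file is removed, every such presence needs either a stored hash per vCard or a decode and re-hash via something like `vcard_photo_hash`.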

wojciech.kapcia@tigase.net commented 4 years ago

In both cases, app would have a lot more to do than just check if file exists and benefit of this change is small if any.

+1

Type
Task
Priority
Minor
Assignee
Spent time
2h
Reference
tigase/_clients/beagle-im#291