EDIT: With thanks to u/[email protected], I have a solution for cleaning up the one table that is growing ridiculously in size and that is the activity table. A TRUNCATE activity; command cleaned that right up. A word of caution is in order because the Lemmy instance must be shutdown completely and only the postgres container running if you’re using docker.

As a new Lemmy admin, I think I really need to learn PostgreSQL administration because I need a way to keep my database from ballooning in size and there are no management tools built yet for it. Would someone be so kind as to recommend a good website for learning PostgreSQL? This would also help me maintain my mastodon instance.

    • code@lemmy.mayes.io
      link
      fedilink
      English
      arrow-up
      5
      ·
      1 year ago

      Look at vacumn command. Also dbeaver is a teally good gui tool. Right now there is one table thats the main cause (i think its called activity but dont quote me on that there was a post about it)

      That table is mainly for debugging as it logs every action in activitypub your instance does. I truncate it once a week and vacumn it (shrinks the disk space used). Caution. You must shut down lemmy before doing so. I run containers so i stop all except postgres and clean it up.

        • PenguinCoder@beehaw.org
          link
          fedilink
          English
          arrow-up
          1
          ·
          1 year ago

          Really ugly, but really works. Connect to psql and run:

          WITH RECURSIVE pg_inherit(inhrelid, inhparent) AS
              (select inhrelid, inhparent
              FROM pg_inherits
              UNION
              SELECT child.inhrelid, parent.inhparent
              FROM pg_inherit child, pg_inherits parent
              WHERE child.inhparent = parent.inhrelid),
          pg_inherit_short AS (SELECT * FROM pg_inherit WHERE inhparent NOT IN (SELECT inhrelid FROM pg_inherit))
          SELECT table_schema
              , TABLE_NAME
              , row_estimate
              , pg_size_pretty(total_bytes) AS total
              , pg_size_pretty(index_bytes) AS INDEX
              , pg_size_pretty(toast_bytes) AS toast
              , pg_size_pretty(table_bytes) AS TABLE
              , total_bytes::float8 / sum(total_bytes) OVER () AS total_size_share
            FROM (
              SELECT *, total_bytes-index_bytes-COALESCE(toast_bytes,0) AS table_bytes
              FROM (
                   SELECT c.oid
                        , nspname AS table_schema
                        , relname AS TABLE_NAME
                        , SUM(c.reltuples) OVER (partition BY parent) AS row_estimate
                        , SUM(pg_total_relation_size(c.oid)) OVER (partition BY parent) AS total_bytes
                        , SUM(pg_indexes_size(c.oid)) OVER (partition BY parent) AS index_bytes
                        , SUM(pg_total_relation_size(reltoastrelid)) OVER (partition BY parent) AS toast_bytes
                        , parent
                    FROM (
                          SELECT pg_class.oid
                              , reltuples
                              , relname
                              , relnamespace
                              , pg_class.reltoastrelid
                              , COALESCE(inhparent, pg_class.oid) parent
                          FROM pg_class
                              LEFT JOIN pg_inherit_short ON inhrelid = oid
                          WHERE relkind IN ('r', 'p')
                       ) c
                       LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
            ) a
            WHERE oid = parent
          ) a
          ORDER BY total_bytes DESC LIMIT 3;
          

          That will show the top 3 database tables sizes. I bet you number one is activity.

      • PenguinCoder@beehaw.org
        link
        fedilink
        English
        arrow-up
        1
        ·
        1 year ago

        You really don’t want to do a vacuum full on the live DB, it will lock everything. While useful and necessary I think, you have got to plan down time for it. Or your site is going to be inaccessible anyway.