Create build for the package on tc server. (#1037)

Andrzej Wójcik (Tigase) commented 6 years ago

Most of the projects (Tigase XMPP Server related) have been created and are building fine. We still need to find a way to deploy the documentation generated by the projects.

@kobit @wojtek Should I look into this as well, i.e. deploying the documentation to S3? Or, for now, should it be copied to build.tigase.net as before?

Artur Hefczyc commented 6 years ago

I would like to get rid of build.tigase.net as soon as possible. It generates unnecessary costs. So, yes, we have to find another way. S3 is definitely an option; I saw lots of articles on Medium on how to host a serverless website from S3 using API Gateway. Hosting cost is close to $0.

Another possible approach is GitHub, with a repository from which we can serve a website, as we do with SiskinIM and BeagleIM.

In any case, if you are interested in researching this topic, please go ahead. However, serving static files and pages from "serverless" is something we will need for a few other projects, so after this research we need a good solution applicable to other cases with some good description on how to deploy it.

If you are not interested in the topic, please let me know. Somebody else will be working on this.

Andrzej Wójcik (Tigase) commented 6 years ago

@kobit Using GitHub Pages is a possible solution (as on the BeagleIM/SiskinIM sites), if we convert the documentation to Markdown and keep it in the gh-pages branch. But this would leave us with only SNAPSHOT versions of the documentation.

Of course, we could generate HTML documentation from AsciiDoc on TC and push it to the gh-pages branch as well, but I'm reluctant to do that. It would increase the load on and the size of the repositories, which is not a good idea.

As we discussed with @wojtek, it would be possible to upload the generated documentation for each released version as (e.g. zipped) attachments to the Releases pages on GitHub, but that would make it impossible for us to point users (e.g. by pasting a link) to particular pages in this documentation.

Because of that, we would need to use a service such as S3 or maybe "hosted WWW" (e.g. AWS Amplify Console). I suppose that S3 would be the simplest solution and I would like to go with it.
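To illustrate why per-version object keys keep individual pages linkable, here is a minimal sketch that maps a directory of generated HTML to S3-style keys. The `project/version/relative-path` layout, the function name `plan_uploads` and the example project name are assumptions made for illustration, not something decided in this thread. The sketch only plans the upload; the actual transfer (for example with an S3 client) is left out.

```python
import mimetypes
from pathlib import Path


def plan_uploads(docs_root, project, version):
    """Return (local_path, object_key, content_type) for every file under docs_root.

    The key layout "<project>/<version>/<relative path>" is a hypothetical
    convention: it gives every page of every released version a stable URL.
    """
    root = Path(docs_root)
    plan = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue  # directories have no object of their own
        relative = path.relative_to(root).as_posix()
        key = f"{project}/{version}/{relative}"
        # Browsers need a correct Content-Type to render pages instead of downloading them.
        content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        plan.append((path, key, content_type))
    return plan
```

With such a layout, a link to a specific page of a released version stays valid after newer versions are published, which is what the zipped release attachments could not offer.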

Artur Hefczyc commented 6 years ago

Yes, I agree with your whole assessment and I also think that AWS with S3 and some serverless setup is the best approach. S3 costs close to nothing for the amount of data we want to store. If you want to investigate this topic, please feel free to do so. However, we are not looking for the "simplest" solution. We are looking for the best solution. I am hoping we can get rid of running a server entirely and implement a "serverless" solution. So, you can spend more time if you need to, in order to get a long-term solution which we can reuse for other things.

Andrzej Wójcik (Tigase) commented 6 years ago

@kobit OK, let me try to get a serverless solution based on S3 running for some projects, and then we will be able to evaluate whether that is OK for us or we need something else. From my point of view, for static websites, S3 plus maybe a Markdown-based front page (e.g. on GitHub Pages) looks like a possible and flexible solution, but an S3-only solution is also an option.

Andrzej Wójcik (Tigase) commented 6 years ago

@kobit @wojtek If we want to get rid of https://build.tigase.net, then after I move all tasks to TC and create storage for the documentation (e.g. in S3), we will still need a place for the Maven repository. If I recall correctly, it is hosted on the same machine. What do we want to do with that? Move it to a smaller machine? Or maybe try to move it to S3, like the documentation? https://tech.asimio.net/2018/06/27/Using-an-AWS-S3-Bucket-as-your-Maven-Repository.html

I suppose that both solutions have pros and cons, e.g. S3 may be cheaper, while having a separate repository and a machine for it is more flexible, but we would need to spend time keeping it running...

Artur Hefczyc commented 6 years ago

I think the most important question here is what we need and what we want in the long run. If there is an argument for keeping the server as a repository manager, we need to consider it.

However, in any case I am in favor of moving forward and trying to set it up on S3 in serverless mode as a test and experiment. Let's see how it works and how it goes.

Wojciech Kapcia (Tigase) commented 6 years ago

@andrzej.wojcik As I commented in the e-mail, I would be reluctant to serve Maven over S3, as this would be like the previous solution we had, which prompted us to migrate to Archiva (hellish maintenance, the snapshot folder growing without limits being one example). Even the linked article mentions the same:

This solution would be valid for a solo developer or a small team. Although a better choice than checking in the resulting artifacts into the SCM, it has some limitations. You won’t be able to Search for an artifact, which is convenient when you need to find out the artifactId, groupId and version of a dependency. Another limitation is maintenance. As the snapshot folder grows overtime, you would have to manually remove older versions.

Apart from that, having a dedicated repository manager has the benefit of removing old snapshots when the version is released (handy).
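The cleanup rule mentioned above (drop a snapshot once its version has been released) can be sketched as follows. This is only an illustration of the rule, not how Archiva implements it; the function name and the plain version strings are assumptions.

```python
SNAPSHOT_SUFFIX = "-SNAPSHOT"


def snapshots_to_remove(versions):
    """Return snapshot versions whose corresponding release already exists."""
    released = {v for v in versions if not v.endswith(SNAPSHOT_SUFFIX)}
    return sorted(
        v
        for v in versions
        # "8.0.0-SNAPSHOT" becomes obsolete once "8.0.0" has been released.
        if v.endswith(SNAPSHOT_SUFFIX) and v[: -len(SNAPSHOT_SUFFIX)] in released
    )
```

On a plain S3 bucket nothing applies this rule automatically, which is the manual maintenance the quoted article warns about.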

Artur Hefczyc commented 6 years ago

These are all good points. Enough to convince me against an S3-based solution, as it would require manual maintenance, which is exactly what I want to avoid.

@andrzej.wojcik If you think you can resolve these issues, please continue the work and experiments with an S3-based solution.

Otherwise we have to consider other options.

There are some hosted offers on the market but they are quite expensive from what I saw. Much more expensive than hosting a dedicated machine on AWS just for this. But maybe I overlooked some offer at reasonable price.

Another possibility is to actually have a machine on AWS to serve Maven. We could still use S3 as the file storage if possible, as it would dramatically reduce storage costs.

Now the question would be what are the requirements for such a machine? I guess it can be pretty low spec. Maybe we could share some of the existing servers we have for devops? Some of them are very underused (hub, source) and could potentially handle additional workload.

Let me know what you think.

Andrzej Wójcik (Tigase) commented 6 years ago

@kobit I think that having a server for the Maven repository is really useful, as @wojtek said. Right now we are using Archiva (https://archiva.apache.org/index.cgi) which, if I recall correctly, does not support S3 and, being Java-based, requires a certain amount of memory. I'm not sure what settings we are currently using.

Alternatively, we could try to use Sonatype Nexus 3 OSS (https://www.sonatype.com/nexus-repository-oss) which is free but requires a lot of RAM -> minimum heap 1.2GB and 2GB of direct memory! Which means that minimal memory of the host should be 4GB! However, it has support for storing maven artifacts on S3.
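The 4GB figure follows from simple arithmetic, sketched below. The list of available host sizes and the rule "pick the first size strictly larger than heap plus direct memory, so the OS has some room" are assumptions made for illustration.

```python
def smallest_host_gb(heap_gb, direct_gb, sizes_gb=(1, 2, 4, 8, 16)):
    """Pick the smallest host size that exceeds the JVM's combined memory needs."""
    required = heap_gb + direct_gb  # 1.2 + 2.0 = 3.2 GB for the figures quoted above
    for size in sizes_gb:
        # Strictly greater, so something is left for the OS and other processes.
        if size > required:
            return size
    raise ValueError("no listed host size is large enough")


nexus_host = smallest_host_gb(1.2, 2.0)
```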

As for documentation hosting, I think that S3 will work quite well.

Wojciech Kapcia (Tigase) commented 6 years ago

@andrzej.wojcik I don't remember off the top of my head, but Archiva is relatively low on resources. The biggest usage is generated by Jenkins and the running build jobs/tests. Without it we should be fine with a rather low-resource machine. AFAIR space on the secondary, mounted drive is relatively cheap as well, so we should most likely evaluate the cost of machine + drive for the current approach vs Nexus (bigger machine but lower storage cost).

Andrzej Wójcik (Tigase) commented 6 years ago

@wojtek That's why I brought it up as AFAIR Archiva does not use a lot of resources.

Wojciech Kapcia (Tigase) commented 6 years ago

Right now the Archiva process is using ~850M of resident memory so... relatively low.

From wrapper conf:

# Initial Java Heap Size (in MB)
#wrapper.java.initmemory=3
wrapper.java.initmemory=512

# Maximum Java Heap Size (in MB)
#wrapper.java.maxmemory=64
wrapper.java.maxmemory=512

Artur Hefczyc commented 6 years ago

Hm, maybe it is worth the effort to put it on the same machine as hub or source then. Hub is under a very low load, so it might be a good candidate. Source is under a low load too but I am not certain if we want to keep it for a longer time. If we do not find it useful we will get rid of it.

To put it on the same machine as Hub we would need a Docker package for Archiva.

As for S3, it can be mounted as a more or less normal filesystem, so it is fairly transparent to the application. It does have some limitations though.

Andrzej Wójcik (Tigase) commented 6 years ago

We need to add redirects from /snapshot/ to /master-snapshot/ for documentation published under docs.tigase.net and check if other redirects are present.
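The intended mapping can be sketched as a small path-rewriting function. Where the redirect is actually configured (web server, S3 routing rules or something else) is not stated in this thread; the function name and the example paths are hypothetical.

```python
def redirect_target(path):
    """Return the new path for a request hitting /snapshot/, or None if no redirect applies."""
    segments = path.split("/")
    # Match whole path segments only, so "master-snapshot" is never rewritten again.
    if "snapshot" not in segments:
        return None
    return "/".join("master-snapshot" if s == "snapshot" else s for s in segments)
```

Matching on whole segments avoids a redirect loop: a path that already contains /master-snapshot/ yields no further redirect.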

Andrzej Wójcik (Tigase) commented 5 years ago

I think that this is done and can be closed.

Type	Task
Priority	Major
Assignee	Wojciech Kapcia (Tigase)
Spent time	0