Improving the compose: leave the current compose in place

Discussion:

Owen Taylor

2018-11-27 14:58:47 UTC

A lot of discussion about improving the compose process seems to end up
with a "reality check" - that ideas have already been tried but don't
work because of requirements a) b) c) d). You can't have the pony, but
maybe if a lot of effort is put into it, you can have a faster rocking
horse.

If we want to fundamentally improve the Fedora workflow, we need compose
ponies; we can't just have rocking horses!

Perhaps it would make sense to leave the current 8-10 hour compose in
place for the foreseeable future, and work on a new system in parallel
where the primary constraint is to be as fast as possible. Hopefully
most problems with the slow compose will get sorted out in the fast
composes, and the slow compose will become more reliable. Perhaps in a
distant future, we can make the new system do everything.

I don't know what the system would look like exactly, but you could
imagine things like:

* Composed of several micro-composes (micro-compose-services?) to
avoid blocking on everything completing successfully.

* Able to do speculative composes for CI

* Either x86_64-only, or with decoupled architectures so that we can
throw x86_64 hardware (or cloud resources) at it, and make it super
fast.

* No IO on /mnt/koji during the compose - having a big network share be
central to the process creates a performance bottleneck, makes it hard
to move to the cloud, and potentially adds a lot of "noise" to
figuring out what is going on, where things are slow because some
other entirely different thing is going on.
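
As an illustration of the first bullet above, here is a minimal sketch
of "micro-composes" that do not block on each other. It assumes each
output can be built by its own independent function. The task names and
the run_micro_composes helper are hypothetical and are not part of
Pungi or any existing Fedora tool; the only point is that one failure
is recorded instead of aborting everything else.

```python
from concurrent.futures import ThreadPoolExecutor


def run_micro_composes(tasks):
    """Run independent compose tasks; return (results, failures) by name.

    `tasks` maps a hypothetical task name to a zero-argument callable.
    A task that raises is recorded in `failures` and does not stop the others.
    """
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {name: pool.submit(fn) for name, fn in tasks.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:  # isolate the failure to this one task
                failures[name] = exc
    return results, failures


def build_repo():
    return "repo ok"


def build_ostree():
    return "ostree ok"


def build_cloud_image():
    # Stand-in for a task that breaks; the other tasks still complete.
    raise RuntimeError("image build failed")


results, failures = run_micro_composes({
    "repo": build_repo,
    "ostree": build_ostree,
    "cloud-image": build_cloud_image,
})
print(sorted(results), sorted(failures))
```

With a layout like this, a consumer that only needs the ostree can pick
it up even though the cloud image failed.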

Add your own bullet points :-)

Owen
_______________________________________________
devel mailing list -- ***@lists.fedoraproject.org
To unsubscribe send an email to devel-***@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/***@lists.fedoraproject.o

Stephen John Smoogen

2018-11-27 15:11:15 UTC

Define what a compose is? Currently it is a word which covers a
multitude of different processes and reasons for those processes. We
can't 'fix' or even 'replace' or parallel them without actually
knowing why someone duct taped this tool to that widget during a 2 am
release window.

If the definition of a compose is to pull out all the packages from koji
and put them together into a 'release' then your No IO is not
possible.
If the definition is that it is every release target, then removing
things makes those things not composes. [That is ok, it just means
that when we call them jam making we know it covers what we want it to
be versus what someone else expects a 'compose' to do.]

--
Stephen J Smoogen.

Owen Taylor

2018-11-27 15:20:40 UTC

Post by Stephen John Smoogen
Define what a compose is? Currently it is a word which covers a
multitude of different processes and reasons for those processes. We
can't 'fix' or even 'replace' or parallel them without actually
knowing why someone duct taped this tool to that widget during a 2 am
release window.

Yes, that's a good point, and something I wanted to say - one of the
starting points is defining exactly what output is needed for a
particular use case and figuring out how to get *that* output as fast as
possible. (Probably the normal case will be taking a new build, and
producing a single testable ostree or image containing that build.)

Being able to optimize along a single path is how you get *much*
faster than the current global compose. And why the problem is
probably not best formulated as "speed up the global compose".
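
One way to picture "optimizing along a single path": if the steps of a
compose are written down as a dependency graph, a request for one
output only needs the steps that output depends on. The step names
below are invented for illustration and do not describe the real Fedora
compose; steps_for is a hypothetical helper.

```python
from graphlib import TopologicalSorter

# Hypothetical steps: each key depends on the steps in its set.
STEPS = {
    "gather-packages": set(),
    "create-repo": {"gather-packages"},
    "build-ostree": {"create-repo"},
    "build-live-iso": {"create-repo"},
    "build-cloud-image": {"create-repo"},
    "build-docs": set(),
}


def steps_for(target, graph):
    """Return only the steps needed for `target`, in a valid run order."""
    needed, stack = set(), [target]
    while stack:
        step = stack.pop()
        if step not in needed:
            needed.add(step)
            stack.extend(graph[step])
    # Sort just the needed subgraph so dependencies run first.
    subgraph = {step: graph[step] for step in needed}
    return list(TopologicalSorter(subgraph).static_order())


plan = steps_for("build-ostree", STEPS)
print(plan)
```

Asking for the ostree yields three steps and never touches the ISO,
cloud image, or docs steps that a global compose would also run.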

Owen

Paul Frields

2018-11-27 18:59:57 UTC

Indeed, this is basically the investigation I've proposed. I also think

Post by Owen Taylor
I don't know what the system would look like exactly, but you could
imagine things like:
* Composed of several micro-composes (micro-compose-services?) to
avoid blocking on everything completing successfully.
* Able to do speculative composes for CI
* Either x86_64-only, or with decoupled architectures so that we can
throw x86_64 hardware (or cloud resources) at it, and make it super
fast.
* No IO on /mnt/koji during the compose - having a big network share be
central to the process creates a performance bottleneck, makes it hard
to move to the cloud, and potentially adds a lot of "noise" to
figuring out what is going on, where things are slow because some
other entirely different thing is going on.
Add your own bullet points :-)

I would like to redefine a couple working assumptions:

* Big tools are unwieldy and inevitably silo knowledge. The people
behind them are often smart, hard-working, and care about great
results. But bedrock FOSS principles say we get more value from
rapidly iterating tools to which many people can/do contribute. We
should see if we can avoid big tools that solve everything.

* Reproducibility is something we can better enforce at development
time than use time. It's pretty easy to pick one or more git heads at
a certain time (for a tool, a containerized environment, etc.). Let's
not get one hand tied behind our back at the outset via outmoded
assumptions.

Every other bullet point on your list, Owen, I agree with 100%.

--
Paul

Peter Robinson

2018-11-29 12:22:37 UTC

Post by Paul Frields

Indeed, this is basically the investigation I've proposed. I also think

* Big tools are unwieldy and inevitably silo knowledge. The people
behind them are often smart, hard-working, and care about great
results. But bedrock FOSS principles say we get more value from
rapidly iterating tools to which many people can/do contribute. We
should see if we can avoid big tools that solve everything.
* Reproducibility is something we can better enforce at development
time than use time. It's pretty easy to pick one or more git heads at
a certain time (for a tool, a containerized environment, etc.). Let's
not get one hand tied behind our back at the outset via outmoded
assumptions.

That is not entirely true. A level of reproducibility is also at build
time based on versions of other packages that the package has been
built against. The versions of components that another component is
built/composed against will greatly affect the reproducibility of a
component and that information is not in git.
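
A rough illustration of the point above, assuming a build is identified
by a git commit plus the list of package versions present in the build
environment. The BuildRecord type and the package version strings are
made up for this sketch; real buildroot contents would have to come
from the build system, since git does not hold them.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildRecord:
    git_commit: str   # what git can tell us
    buildroot: tuple  # package versions the build ran against (not in git)

    def fingerprint(self):
        """Hash the commit together with the sorted buildroot contents."""
        h = hashlib.sha256()
        h.update(self.git_commit.encode())
        for nvr in sorted(self.buildroot):
            h.update(b"\0" + nvr.encode())
        return h.hexdigest()


# Same commit, different compiler in the build environment (made-up versions).
a = BuildRecord("abc123", ("gcc-8.2.1-5.fc29", "glibc-2.28-20.fc29"))
b = BuildRecord("abc123", ("gcc-8.2.1-6.fc29", "glibc-2.28-20.fc29"))

print("same commit:", a.git_commit == b.git_commit)
print("same build inputs:", a.fingerprint() == b.fingerprint())
```

Pinning only the git head treats these two builds as identical, while a
record that includes the buildroot distinguishes them.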

Peter

Ken Dreyer

2018-12-01 19:46:13 UTC

There have been several efforts to improve Pungi performance over time. Is
there any email list or communication channel where this effort could
be coordinated?

I work on a project in Ceph that uses Pungi a lot, so I'm really
interested in making composes as fast as possible. Our Jenkins system
runs Pungi several times a day (every time a build completes in Koji),
so that we can deliver composes to QE immediately. I'd like to run it
even more frequently (like on every pull request scratch build).

Maybe we could write a dedicated page in Pungi's upstream
documentation, "performance tips for making Pungi as fast as
possible". It could explain the dogpile.cache stuff, hardlinks vs
http, etc.

This is a great idea, although it's a little tricky to do everything
in a local tmpdir and still take advantage of the speed that NFS
hardlinks provide.
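
The trade-off can be sketched as "hardlink when source and destination
share a filesystem, copy otherwise". This is a generic illustration
using the Python standard library, not Pungi's actual linking code;
link_or_copy and the file names are hypothetical.

```python
import os
import shutil
import tempfile


def link_or_copy(src, dst):
    """Hardlink src to dst if possible, else fall back to a full copy.

    Returns "hardlink" or "copy" so callers can see which path was taken.
    """
    try:
        os.link(src, dst)       # no data is moved; needs a shared filesystem
        return "hardlink"
    except OSError:
        shutil.copy2(src, dst)  # e.g. cross-device link, or links unsupported
        return "copy"


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "example.rpm")
    dst = os.path.join(tmp, "compose-example.rpm")
    with open(src, "wb") as f:
        f.write(b"package payload")
    how = link_or_copy(src, dst)
    with open(dst, "rb") as f:
        copied = f.read()
    print(how)
```

A compose working in a local tmpdir on a different filesystem from the
package store would always take the copy branch, which is the cost
being weighed here.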

Post by Owen Taylor
Add your own bullet points :-)

Another harebrained idea: reduce or eliminate Pungi's thread model and
make it asynchronous, using https://github.com/ktdreyer/txkoji :)
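
To show the shape of that idea without guessing at txkoji's API (which
is not reproduced here), the sketch below uses Python's asyncio and a
stand-in fetch_build coroutine. The function and the build names are
hypothetical; a real version would issue Koji calls where the sleep is.

```python
import asyncio


async def fetch_build(name):
    """Stand-in for a network call to the build system (hypothetical)."""
    await asyncio.sleep(0.01)  # placeholder for waiting on I/O
    return {"name": name, "state": "COMPLETE"}


async def fetch_all(names):
    # All requests wait concurrently on one thread; no thread pool or locks.
    return await asyncio.gather(*(fetch_build(n) for n in names))


builds = asyncio.run(fetch_all(["pkg-a", "pkg-b", "pkg-c"]))
print([b["name"] for b in builds])
```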