forked from gitea/gitea
1
0
Fork 0
Commit Graph

18 Commits

Author SHA1 Message Date
zeripath 3ad62127df
Correctly handle select on multiple channels in Queues (#22146) (#22428)
Backport #22146

There are a few places in FlushQueueWithContext which make an incorrect
assumption about how `select` on multiple channels works.

The problem is best expressed by looking at the following example:

```go
package main

import "fmt"

func main() {
    closedChan := make(chan struct{})
    close(closedChan)
    toClose := make(chan struct{})
    count := 0

    for {
        select {
        case <-closedChan:
            count++
            fmt.Println(count)
            if count == 2 {
                close(toClose)
            }
        case <-toClose:
            return
        }
    }
}
```

This PR double-checks that the contexts are closed outside of checking
if there is data in the dataChan. It also rationalises the WorkerPool
FlushWithContext because the previous implementation failed to handle
pausing correctly. This will probably fix the underlying problem in
 #22145

Fix #22145

Signed-off-by: Andrew Thornton <art27@cantab.net>
2023-01-13 20:42:42 +00:00
Gusted b5383590de
Fix 64-bit atomic operations on 32-bit machines (#19531)
- Doing 64-bit atomic operations on 32-bit machines is a bit tricky by
golang, as they can only be done under certain set of
conditions(https://pkg.go.dev/sync/atomic#pkg-note-BUG).
- This PR fixes such case whereby the conditions weren't met, it moves
the int64 to the first field of the struct, which will 64-bit operations
happening on this property on 32-bit machines.
- Resolves #19518
2022-04-27 10:32:04 -05:00
zeripath c88547ce71
Add Goroutine stack inspector to admin/monitor (#19207)
Continues on from #19202.

Following the addition of pprof labels we can now more easily understand the relationship between a goroutine and the requests that spawn them. 

This PR takes advantage of the labels and adds a few others, then provides a mechanism for the monitoring page to query the pprof goroutine profile.

The binary profile that results from this profile is immediately piped in to the google library for parsing this and then stack traces are formed for the goroutines.

If the goroutine is within a context or has been created from a goroutine within a process context it will acquire the process description labels for that process. 

The goroutines are mapped with there associate pids and any that do not have an associated pid are placed in a group at the bottom as unbound.

In this way we should be able to more easily examine goroutines that have been stuck.

A manager command `gitea manager processes` is also provided that can export the processes (with or without stacktraces) to the command line.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-03-31 19:01:43 +02:00
zeripath 4e57bd1d30
Add number in queue status to monitor page (#18712)
Add number in queue status to the monitor page so that administrators can
assess how much work is left to be done in the queues.

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
2022-02-12 13:31:26 +08:00
zeripath f8b21ac04a
Simplify Boost/Pause logic (#18673)
* Simplify Boost/Pause logic

#18658 has added a check to see if we need to boost because there is still work to do
however the check is slightly complex and not ideal. There's no point boosting if
the queue is paused or can't scale. Therefore merge the two selects into one and add
a check to p.paused.

Signed-off-by: Andrew Thornton <art27@cantab.net>

* And on resume add a zeroboost if necessary

Signed-off-by: Andrew Thornton <art27@cantab.net>

* simplify

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: Lauris BH <lauris@nix.lv>
2022-02-08 13:53:34 -05:00
zeripath df44017328
Restart zero worker if there is still work to do (#18658)
* Restart zero worker if there is still work to do

It is possible for the zero worker to timeout before all the work is finished.
This may mean that work may take a long time to complete because a worker will only
be induced on repushing.

Also ensure that requested count is reset after pulls and push mirror sync requests and add some more trace logging to the queue push.

Fix #18607

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-02-08 14:02:32 +00:00
zeripath 7ba1b7112f
Only attempt to flush queue if the underlying worker pool is not finished (#18593)
* Only attempt to flush queue if the underlying worker pool is not finished

There is a possible race whereby a worker pool could be cancelled but yet the
underlying queue is not empty. This will lead to flush-all cycling because it
cannot empty the pool.

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Apply suggestions from code review

Co-authored-by: Gusted <williamzijl7@hotmail.com>

Co-authored-by: Gusted <williamzijl7@hotmail.com>
2022-02-05 20:51:25 +00:00
zeripath be77ede954
Change some logging levels (#18421)
* Change some logging levels

* PlainTextWithBytes - 4xx/5xx this should just be TRACE
* notFoundInternal - the "error" here is too noisy and should be DEBUG
* WorkerPool - Worker pool scaling messages are normal and should be DEBUG

Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
2022-01-29 20:52:37 +00:00
zeripath 713985b1a4
Prevent deadlocks in persistable channel pause test (#18410)
* Prevent deadlocks in persistable channel pause test

Because of reuse of the old paused/resumed channels in this test there
was a potential for deadlock. This PR ensures that the channels are always
reobtained.

It further adds some control code to detect hangs in future - and it
ensures that the pausing warning is not shown on shutdown.

Signed-off-by: Andrew Thornton <art27@cantab.net>

* do not warn but do pause

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-01-26 01:09:57 +02:00
zeripath ab7f701671
Make WrappedQueues and PersistableChannelUniqueQueues Pausable (#18393)
Implements the Pausable interface on WrappedQueues and PersistableChannelUniqueQueues

Reference #15928

Signed-off-by: Andrew Thornton art27@cantab.net
2022-01-24 22:54:35 +00:00
zeripath a82fd98d53
Pause queues (#15928)
* Start adding mechanism to return unhandled data

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Create pushback interface

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Add Pausable interface to WorkerPool and Manager

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Implement Pausable and PushBack for the bytefifos

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Implement Pausable and Pushback for ChannelQueues and ChannelUniqueQueues

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Wire in UI for pausing

Signed-off-by: Andrew Thornton <art27@cantab.net>

* add testcases and fix a few issues

Signed-off-by: Andrew Thornton <art27@cantab.net>

* fix build

Signed-off-by: Andrew Thornton <art27@cantab.net>

* prevent "race" in the test

Signed-off-by: Andrew Thornton <art27@cantab.net>

* fix jsoniter mismerge

Signed-off-by: Andrew Thornton <art27@cantab.net>

* fix conflicts

Signed-off-by: Andrew Thornton <art27@cantab.net>

* fix format

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Add warnings for no worker configurations and prevent data-loss with redis/levelqueue

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Use StopTimer

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
2022-01-22 21:22:14 +00:00
6543 54e9ee37a7
format with gofumpt (#18184)
* gofumpt -w -l .

* gofumpt -w -l -extra .

* Add linter

* manual fix

* change make fmt
2022-01-20 18:46:10 +01:00
zeripath ba526ceffe
Multiple Queue improvements: LevelDB Wait on empty, shutdown empty shadow level queue, reduce goroutines etc (#15693)
* move shutdownfns, terminatefns and hammerfns out of separate goroutines

Coalesce the shutdownfns etc into a list of functions that get run at shutdown
rather then have them run at goroutines blocked on selects.

This may help reduce the background select/poll load in certain
configurations.

* The LevelDB queues can actually wait on empty instead of polling

Slight refactor to cause leveldb queues to wait on empty instead of polling.

* Shutdown the shadow level queue once it is empty

* Remove bytefifo additional goroutine for readToChan as it can just be run in run

* Remove additional removeWorkers goroutine for workers

* Simplify the AtShutdown and AtTerminate functions and add Channel Flusher

* Add shutdown flusher to CUQ

* move persistable channel shutdown stuff to Shutdown Fn

* Ensure that UPCQ has the correct config

* handle shutdown during the flushing

* reduce risk of race between zeroBoost and addWorkers

* prevent double shutdown

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-05-15 16:22:26 +02:00
zeripath 0590176a23
Only use boost workers for leveldb shadow queues (#15696)
* The leveldb shadow queue of a persistable channel queue should always start with 0
workers and just use boost to add additional workers if necessary.

* create a zero boost so that if there are no workers in a pool - boost to start the workers

* actually set timeout appropriately on boosted workers

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-05-02 08:22:30 +01:00
zeripath c58bc4bf80
Prevent timer leaks in Workerpool and others (#11333)
There is a potential memory leak in `Workerpool` due to the intricacies of
`time.Timer` stopping.

Whenever a `time.Timer` is `Stop`ped its channel must be cleared using a
`select` if the result of the `Stop()` is `false`.

Unfortunately in `Workerpool` these were checked the wrong way round.

However, there were a few other places that were not being checked.

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
2020-05-08 16:46:05 +01:00
zeripath 88986746d5
Fix Workerpool deadlock (#10283)
* Prevent deadlock on boost

* Force a boost in testchannelqueue
2020-02-15 18:44:58 +00:00
zeripath c01221e70f
Queue: Make WorkerPools and Queues flushable (#10001)
* Make WorkerPools and Queues flushable

Adds Flush methods to Queues and the WorkerPool
Further abstracts the WorkerPool
Adds a final step to Flush the queues in the defer from PrintCurrentTest
Fixes an issue with Settings inheritance in queues

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Change to for loop

* Add IsEmpty and begin just making the queues composed WorkerPools

* subsume workerpool into the queues and create a flushable interface

* Add manager command

* Move flushall to queue.Manager and add to testlogger

* As per @guillep2k

* as per @guillep2k

* Just make queues all implement flushable and clean up the wrapped queue flushes

* cope with no timeout

Co-authored-by: Lauris BH <lauris@nix.lv>
2020-01-28 20:01:06 -05:00
zeripath 62eb1b0f25 Graceful Queues: Issue Indexing and Tasks (#9363)
* Queue: Add generic graceful queues with settings

* Queue & Setting: Add worker pool implementation

* Queue: Add worker settings

* Queue: Make resizing worker pools

* Queue: Add name variable to queues

* Queue: Add monitoring

* Queue: Improve logging

* Issues: Gracefulise the issues indexer

Remove the old now unused specific queues

* Task: Move to generic queue and gracefulise

* Issues: Standardise the issues indexer queue settings

* Fix test

* Queue: Allow Redis to connect to unix

* Prevent deadlock during early shutdown of issue indexer

* Add MaxWorker settings to queues

* Merge branch 'master' into graceful-queues

* Update modules/indexer/issues/indexer.go

Co-Authored-By: guillep2k <18600385+guillep2k@users.noreply.github.com>

* Update modules/indexer/issues/indexer.go

Co-Authored-By: guillep2k <18600385+guillep2k@users.noreply.github.com>

* Update modules/queue/queue_channel.go

Co-Authored-By: guillep2k <18600385+guillep2k@users.noreply.github.com>

* Update modules/queue/queue_disk.go

* Update modules/queue/queue_disk_channel.go

Co-Authored-By: guillep2k <18600385+guillep2k@users.noreply.github.com>

* Rename queue.Description to queue.ManagedQueue as per @guillep2k

* Cancel pool workers when removed

* Remove dependency on queue from setting

* Update modules/queue/queue_redis.go

Co-Authored-By: guillep2k <18600385+guillep2k@users.noreply.github.com>

* As per @guillep2k add mutex locks on shutdown/terminate

* move unlocking out of setInternal

* Add warning if number of workers < 0

* Small changes as per @guillep2k

* No redis host specified not found

* Clean up documentation for queues

* Update docs/content/doc/advanced/config-cheat-sheet.en-us.md

* Update modules/indexer/issues/indexer_test.go

* Ensure that persistable channel queue is added to manager

* Rename QUEUE_NAME REDIS_QUEUE_NAME

* Revert "Rename QUEUE_NAME REDIS_QUEUE_NAME"

This reverts commit 1f83b4fc9b.

Co-authored-by: guillep2k <18600385+guillep2k@users.noreply.github.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: techknowlogick <matti@mdranta.net>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
2020-01-07 12:23:09 +01:00