Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Integrate Fleetlock client with Locksmith #14

Open
wants to merge 15 commits into
base: flatcar-master
Choose a base branch
from

Conversation

aniruddha2000
Copy link
Contributor

@aniruddha2000 aniruddha2000 commented Sep 27, 2021

Integrate Fleetlock client with Locksmith

In this PR we are refactoring the Locksmith code base and integrating the Fleetlock client. We are using Go FleetLock HTTP client instead of the etcd lock client.

@aniruddha2000 aniruddha2000 force-pushed the aniruddha/integrate-fleetlock branch from c0219aa to f712b04 Compare September 28, 2021 09:08
@aniruddha2000
Copy link
Contributor Author

@tormath1 need little code review 😄

@tormath1
Copy link
Contributor

@aniruddha2000 hi ! Thanks for your contribution - I see the CI is still failling with the following error:

go: inconsistent vendoring in /home/runner/work/locksmith/locksmith:
	github.com/flatcar-linux/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
	github.com/godbus/dbus/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
	go.uber.org/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
	golang.org/x/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
	github.com/godbus/dbus/[email protected]: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod
	golang.org/x/[email protected]: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod

	To ignore the vendor directory, use -mod=readonly or -mod=mod.
	To sync the vendor directory, run:
		go mod vendor
make: *** [Makefile:22: bin/locksmithctl] Error 1

@aniruddha2000 aniruddha2000 force-pushed the aniruddha/integrate-fleetlock branch from f712b04 to 38395fe Compare September 28, 2021 15:12
@aniruddha2000
Copy link
Contributor Author

@tormath1 yes that's true. We have to inspect where is the problem. I think something is not syncing between the go.mod and vendor/modules.txt but for that do we have any option to fix it manually instead of go mod vendor?

@aniruddha2000
Copy link
Contributor Author

Seems like i fixed the interface definition :)

@aniruddha2000 aniruddha2000 changed the base branch from flatcar-master to tormath1-aniruddha/fleetlock-poc September 29, 2021 12:59
@aniruddha2000
Copy link
Contributor Author

aniruddha2000 commented Oct 16, 2021

Deamon.go -
In deamon.go daemon.go#L224 is failing because there is no New function in the LockClient.
Same error in the -
lock.go#L55
reboot.go#L59
set_max.go#L47
status.go#L46
unlock.go#L55

The getClient() locksmithctl.go#L186 function is dealing with EtcdLockClient and it's also dealing with endpoints as per I understand. I think we can replace the getClient() implementation with something like generateRequest implementation client.go#L46 as it also deals with endpoints.

set_max.go -
As per I understand it is for maximum number of semaphore that can be released but in current fleetlock there is only one lock that can be released. So I think we can remove the set_max and see what error we get by CI failing

semaphore.go -
We can remove the semaphore.go and semaphore_test.go as we are not dealing with semaphore.

@tormath1
Copy link
Contributor

@aniruddha2000 thanks for this investigation, this is an awesome job 💪

This commit: Call the RecursiveLock, UnlockIfHeld function in the Lock and Unlock looks good.

it's also dealing with endpoints as per I understand

This is a hack introduced in #11 to mitigate side effects from etcd/v2 upgrade - so it should be removed from now since we don't deal anymore with etcd.

For the getClient(), I think we can preserve the current logic (building the HTTP client) in order to return a FleetLock client (https://pkg.go.dev/github.com/flatcar-linux/fleetlock/pkg/client#New) - we can also rename Endpoints to be URL (the actual FleetLock server URL).

Since the FleetLock client needs an ID, we need to port this logic into the getClient: https://github.com/flatcar-linux/locksmith/blob/b54e4c68e0ac154402cb4d30511c70b02a6509f3/locksmithctl/lock.go#L45-L53

@tormath1 tormath1 changed the base branch from tormath1-aniruddha/fleetlock-poc to flatcar-master October 20, 2021 15:28
@aniruddha2000 aniruddha2000 force-pushed the aniruddha/integrate-fleetlock branch from 84bdfce to c003348 Compare October 20, 2021 15:39
@aniruddha2000
Copy link
Contributor Author

aniruddha2000 commented Oct 20, 2021

@tormath1 we can lock.go#L55 change the New method calling with the fleetlock New implementation https://pkg.go.dev/github.com/flatcar-linux/fleetlock/pkg/client#New and the also same in the places where the same error is occurring.

@aniruddha2000
Copy link
Contributor Author

make: *** [Makefile:22: bin/locksmithctl] Error 2
I think this error is because build is not getting the test file etcd_test.go. We need to find the way to make the build ignore the removed tests.

@tormath1
Copy link
Contributor

@aniruddha2000 for your information, in the commit Remove loop for resilience... you can just run git revert 860103cdfbf42c57478af02f3db187a6a57195bb to revert the commit itself (860103c)

Regarding the build error:

$ make
# github.com/flatcar-linux/locksmith/locksmithctl
locksmithctl/locksmithctl.go:183:20: undefined: lock.EtcdLockClient
locksmithctl/set_max.go:54:20: l.SetMax undefined (type *lock.Lock has no field or method SetMax)
locksmithctl/status.go:48:15: l.Get undefined (type *lock.Lock has no field or method Get)
make: *** [Makefile:22: bin/locksmithctl] Error 2
  • locksmithctl/locksmithctl.go:183:20: undefined: lock.EtcdLockClient
    In the getClient() signature, we should have *fleetlock.Client in place of lock.EtcdLockClient

  • locksmithctl/set_max.go:54:20: l.SetMax undefined (type *lock.Lock has no field or method SetMax)
    With FleetLock, maximore of semaphore can be done on the server implementation side, so we can remove the set-max subcommand from the CLI.

  • locksmithctl/status.go:48:15: l.Get undefined (type *lock.Lock has no field or method Get)
    Same as above, we can't query the status so we can remove the status subcommand too.

Basically, locksmithctl will be use to lock / unlock and run as a daemon.

@aniruddha2000 aniruddha2000 force-pushed the aniruddha/integrate-fleetlock branch 3 times, most recently from ad84eb8 to a5299d4 Compare October 21, 2021 16:27
@aniruddha2000 aniruddha2000 force-pushed the aniruddha/integrate-fleetlock branch from a5299d4 to ee1ed57 Compare October 22, 2021 06:50
Transport: transport,
Username: globalFlags.EtcdUsername,
Password: globalFlags.EtcdPassword,
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines are failing because I can't see any Config struct implementation in the client package. Do we will be dealing with username and password? If not we can remove or if we want to keep the logic then we need other modification way

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see any Config struct implementation in the client package.

It's almost this - actually, to create the fleetlock.Client, we need four things:

The idea of passing net/http.Client to the FleetLock client constructor is that we give freedom, in the implementation, to use custom Transport (SSL, etc.) but also any authentication required.

For Fleetlock, there is no dedicated authorization - so for now, we can just keep the transport to build our net/http.Client.

Do we will be dealing with username and password? If not we can remove or if we want to keep the logic then we need other modification way

We can keep them but it requires some modification on the fleetlock.Client to support basic authentication.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Then let's implement the basic auth in the fleetlock project side by side.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did: https://github.com/flatcar-linux/fleetlock/compare/tormath1/basic-auth I'll open a PR then let you know when it's done. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok :D

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR is opened and under review: flatcar/fleetlock#1 - expect some changes in the fleetlock/client.New :)

return lc, nil
lc, err := lock.NewEtcdLockClient(kapi, globalFlags.Group)
if err != nil {
return nil, err
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lock package is not here and I think we can remove the NewEtcdLockClient and implement the https://pkg.go.dev/github.com/flatcar-linux/[email protected]/pkg/client#New instead.
What do you think @tormath1 :)

Copy link
Contributor

@tormath1 tormath1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with the changes I commented, I succeed to compile locksmith 💪 so great work, we will soon be able to make some tests. :)

@@ -183,7 +179,7 @@ func main() {

// getLockClient returns an initialized EtcdLockClient, using an etcd
// client configured from the global etcd flags
func getClient() (*lock.EtcdLockClient, error) {
func getClient() (*client.Client, error) {
// copy of github.com/coreos/etcd/client.DefaultTransport so that
// TLSClientConfig can be overridden.
transport := &http.Transport{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be replaced with http.DefaultClient which was not introduced yet when this lines have been wrote.

This is an example with a copy of the default http client in case the transport must be overwrittent: https://github.com/flatcar-linux/fleetlock/blob/05e572675abd65352cd200040cb2fe2bf68c37fd/cmd/lock.go#L16-L21

@aniruddha2000 aniruddha2000 force-pushed the aniruddha/integrate-fleetlock branch from ec3c1cc to 1ab32ac Compare October 26, 2021 16:30
@aniruddha2000 aniruddha2000 force-pushed the aniruddha/integrate-fleetlock branch from 3ffb93e to 226abd6 Compare October 26, 2021 19:19
getClient use the global flags and use the http DefaultClient to build
the fleetlock client and returns it
@aniruddha2000 aniruddha2000 changed the title [WIP] Update the LockClient Interface [WIP] Integrate Fleetlock client with Locksmith Nov 15, 2021
@tormath1
Copy link
Contributor

hey @aniruddha2000, IIRC the PoC was working fine as we did a demo during a Flatcar Community Call (https://www.youtube.com/watch?v=X_nqgXLOmLk).

I could have another round of review but IIRC tests were requiring some work - would you be interested to resume this PR ? Or I can take over if you don't have time to contribute on this :) You already did a great breakthrough, Just let us know !

- Remove semaphore.go & semaphore_test.go
- Move holder error[ErrExist & ErrNotExist] from semaphore.go to lock.go

Signed-off-by: Aniruddha Basak <[email protected]>
@tormath1
Copy link
Contributor

@aniruddha2000 overall it looks good. While writing a test for this (flatcar/mantle#336) - I realized we should get this feature smoothly lands into the OS to not break workloads.
I would come with this approach:

  • CLI: keep the current behavior but we could add an exp subcommand to play with fleetlock
locksmithctl exp --id kola-test-id --group default --endpoint http://localhost:3333 lock
  • daemon: we could configure the daemon with a dropin to enable experimental feature (like fleetlock):
LOCKSMITHD_EXP=true

With this approach, we could slowly integrate the new feature while preserving the current behavior. What do you think ?

@pothos
Copy link
Member

pothos commented Jun 1, 2022

I don't think introducing something as "experimental" is a good way because it would itself become a standard and people rely on it this way - let's find a good final solution instead from day one.
Can we rather add a --fleetlock-endpoint=http://127.0.0.1:3333 flag that works as alternative to --endpoint= for direct etcd, or even better use another URI scheme like --endpoint=fleetlock://127.0.0.1:3333?

@tormath1
Copy link
Contributor

tormath1 commented Jun 1, 2022

Yeah I really like this approach (--endpoint=fleetlock://127.0.0.1:3333) - that would simplify the codebase (no need to add a new sub-command) and the configuration itself.

@invidian
Copy link
Member

invidian commented Jun 1, 2022

Does the fleetlock endpoint use HTTP? If so, I'm not sure about using the fleetlock URI scheme, as it often describes the protocol used for the communication.

EDIT: https://coreos.github.io/zincati/development/fleetlock/protocol/ says it uses HTTP. I guess it can also use HTTPS as well then? How would you configure HTTPS when using fleetlock URI scheme?

@tormath1
Copy link
Contributor

tormath1 commented Jun 1, 2022

Does the fleetlock endpoint use HTTP? If so, I'm not sure about using the fleetlock URI scheme, as it often describes the protocol used for the communication.

EDIT: https://coreos.github.io/zincati/development/fleetlock/protocol/ says it uses HTTP. I guess it can also use HTTPS as well then? How would you configure HTTPS when using fleetlock URI scheme?

@invidian thanks for jumping in ! The idea behind fleetlock:// is to let the user decides if he wants to use or not etcd or fleetlock to ensure backward compatibility.

@invidian
Copy link
Member

invidian commented Jun 1, 2022

Yeah, I get that and I see how re-using --endpoint CLI arg makes it simple to handle. But I'm pointing out it comes with nasty limitations like I mentioned, which might be difficult to overcome later. Though maybe just using fleetlocks:// would be fine if there is a need in the future 😄

@pothos
Copy link
Member

pothos commented Jun 1, 2022

One could also use fleetlock+http:// and fleetlock+https://

@invidian
Copy link
Member

invidian commented Jun 1, 2022

Good idea @pothos. So it can be:
fleetlock -> http
fleetlock+http -> http
fleetlock+https -> https

Do you have some other examples of + in URI schemes? Is it something more wildly used? https://en.wikipedia.org/wiki/List_of_URI_schemes only mentions stratum+tcp I think.

@pothos
Copy link
Member

pothos commented Jun 1, 2022

I think there is git+ssh

@tormath1
Copy link
Contributor

tormath1 commented Jun 1, 2022

I think there is git+ssh

yeah I seen that too in some libvirt documentation: https://libvirt.org/uri.html#remote-uris

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

4 participants