Relay Server autodiscovery

As far as I understand, the Relay Server setting is currently static for devices.
I.e., a device will keep working with the same Relay Server unless it is specifically reconfigured for another one (for which you need the Relay Server BLOB).

Is there a way to change the Relay Server automatically based on, say, the WLAN profile in use or anything like that? Or to address the Relay Server by domain name and let DNS handle the job (not a good solution for Windows Mobile)? What is the recommended design, then, if a device roams between sites that have different Relay Servers and needs to access those?

All I can think of now is using external tools to monitor the situation and run the MSP Client with the task of installing the proper Relay Server BLOB.
Juan-Antonio Ma...
I guess the Relay Server address is in the Staging Barcodes, so you don't need connectivity to the first RS to change to a second one.
Actually, we simply do so at Correos: MC75s are staged at a given depot by the partner, and to "attach" them once delivered you simply read a staging barcode which "contains" the new one's address.


Arsen Bandurian
I understand you can just re-stage the device, but I want something fully automated.
I want the device to "understand" that under certain conditions it should use a different Relay Server.
I see two reasonable scenarios:
* A device moving between sites that needs to talk to Relay Servers often for whatever reason
* Relay Server redundancy (if the RS is down/deleted/reconfigured before the devices were updated, for any reason)

I realize that these situations will not happen often and you can always remote control into devices and initiate on-demand/automatic electronic staging over the known network (but this still involves human intervention), so this is more of a theoretical question to understand the current system capabilities/limits and prepare for possible questions.


Arsen Bandurian
Basically, this could answer my question. Let's assume I need to flip between just two Relay Servers: RS1 and RS2.
Using automated staging functionality with multiple profiles and running the job based on Connectivity conditions, I can create something akin to "if RS1 is not contactable for X time AND RS2 is contactable for Y time, run the MSP Client executable to automatically stage the device with RS2". And the same for flipping back to RS1. Obviously, for this to work the device must always be non-compliant, so it is a bit more complicated.

Questions:
1. Is it technically possible via MSP-only means (no external utilities)? Does it actually make sense?
2. Normally, to run a job on the device, the job must first be pushed to the device by MSP. But whenever my device flips the RS, the job will be completed. So there's no way to flip the RS twice before actually contacting the Relay Server (to report the completed job, become non-compliant again and get a new one). Any workarounds?
3. Any other caveats?
4. What about abandoned/orphaned Discovery Docs and Jobs left at Relay Servers that the device is no longer using?
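
To make the flip rule above concrete, here is a minimal sketch of the decision logic only. It is not MSP code: the `Reachability` type, the `should_flip` function and the thresholds are hypothetical names for illustration, and in MSP the equivalent would presumably be expressed through Conditions rather than a script.

```python
from dataclasses import dataclass


@dataclass
class Reachability:
    """Current state of one Relay Server and how long (seconds) it has held."""
    reachable: bool
    duration_s: float


def should_flip(current: Reachability, other: Reachability,
                down_for_s: float, up_for_s: float) -> bool:
    """Return True when the current RS has been unreachable for at least
    down_for_s AND the other RS has been reachable for at least up_for_s."""
    current_down = (not current.reachable) and current.duration_s >= down_for_s
    other_up = other.reachable and other.duration_s >= up_for_s
    return current_down and other_up
```

Requiring both durations gives some hysteresis, so a brief outage of RS1 or a brief glimpse of RS2 does not trigger a re-stage.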


Juan-Antonio Ma...
Sorry I can't answer these questions, but I wonder if having different intranet DNS servers on different sites (which do not have connectivity between them) "pointing" to different Relay Servers, but always with the same FQDN, could help you with this.
That way, always using the same name will actually reach a different RS depending on the site.


Michael Hennel
MSP relies on the Relay Server attribute to determine where to send Jobs. MSP also assumes that the Relay Server is a single machine with a single file system. If a device transferred between Relay Servers and the servers' file systems were not exactly replicated, Jobs could time out and devices could be unable to recover.


Juan-Antonio Ma...
I assume Arsen's problem is simply connectivity/visibility and all the RS are "cloned".


Arsen Bandurian
No, they actually can be different. This is why I asked what happens with orphaned jobs, etc.


Allan Herrod

Arsen;

You are really in an area which MSP was not designed to handle.  I can think of some ways to do what you are describing, but all of them would require stretching the MSP model in ways it really wasn't designed to work.  To arrive at a robust solution, you would need to think through all the cases you want to handle and make sure they are all covered, including error situations.  The more complex the scenarios, the more difficult it will be to get it all working reliably.  There is really no "one size fits all" solution for this kind of situation.

A lot depends on the architecture of the network as well.  If the network topology is such that a device can still reach its old Relay Server even after moving to a new location, that will simplify things.  In such a case, when a device roams, it can report information to MSP through its old Relay Server that can be used to trigger sending a Job to move it to its new Relay Server.  This will likely result in a clean handoff.  If the network topology is such that a device cannot reach its old Relay Server after moving to a new location, then things get more complicated.  In such a case, the logic to move to a new Relay Server would need to be entirely autonomous on the device.  That would require preloading the new Relay Server Settings BLOBs on the device and using a LockAndWipe Settings Object, with appropriate Conditions, to apply the new Relay Server Settings when needed.

Alternately, in situations where all locations are configured similarly, for example, every Site uses NAT and has the same IP addresses, the Relay Server address may be the same at every Site.  In such cases, if the same WLAN Settings are used at each Site, then when a device roams from one Site to another, it will automatically roam to the new Relay Server without even knowing it, since it will reach the new Relay Server using the same IP address.  This kind of situation can also be achieved when using DNS names to access the Relay Server and using local DNS name resolution to map the same name to different addresses at different Sites.  In such cases, when a device roams, there MAY well be orphaned Jobs because the device fails to check in and perform a Job sent to the old Relay Server.  Eventually, MSP will stale out the old Job and a Job can be sent to the new Relay Server.  This is not perfect right now, and we are considering enhancements to make MSP more aware of this, so when the Relay Server changes, MSP will abandon Jobs pending on the old Relay Server right away and immediately send Jobs to the new Relay Server.
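
As a rough illustration of the "same name, different address per Site" idea, the sketch below simulates per-site DNS views with plain dictionaries. The site names, the host name `relay.example.internal` and the addresses are all made up for the example; a real deployment would rely on each Site's own DNS server rather than any code on the device.

```python
# Hypothetical per-site DNS views: each Site's local DNS server maps the
# SAME Relay Server name to that Site's own Relay Server address.
SITE_DNS = {
    "site-a": {"relay.example.internal": "10.1.0.10"},
    "site-b": {"relay.example.internal": "10.2.0.10"},
}

# The device is staged once with this name and never reconfigured.
DEVICE_RELAY_NAME = "relay.example.internal"


def resolve(site: str, name: str) -> str:
    """Resolve a name as seen from the given Site's local DNS view."""
    return SITE_DNS[site][name]
```

The device setting stays constant; only the answer from the local DNS changes when the device roams, which is also why Jobs left on the old Relay Server can end up orphaned.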

As for redundancy in the event that a Relay Server goes down, this is something we have been asked about many times and is something that really would need to be handled at a more global system level.  Having a device switch over from one Relay Server to another due to such a failure will likely not work very well unless MSP is similarly switching and coordinating the handoff.  As it stands right now, if a Relay Server goes down, management of devices that are reliant on that Relay Server stops until that Relay Server can be recovered.  MSP is designed so that this will have minimal negative impact on the non-management activities of the device.  But to reestablish management, it will be necessary to recover or replace the failed Relay Server with a comparable Relay Server with the same address, or with the same network name.  Or, it will be necessary to restage the device to use a new Relay Server.

If you want to discuss specific scenarios for a specific customer, and how they might be addressed, I would be happy to work with you.  But a generic MSP solution is likely pretty far off because of the vast array of possible situations.

Allan


Arsen Bandurian
Allan, thanks for an extended reply.

I am interested in just two scenarios:
1. A device is moved to a new site with a different RS (different IP, name, password). The old RS is not reachable. The task is to make the device work with the new RS (e.g., by having a BLOB on the device).
As far as I understand, you suggest using LockAndWipe scenarios to determine connectivity and then execute the MSP Agent with an instruction to stage a new RS. Right?
If this is technically achievable without any 3rd-party tools, this sounds pretty much like an answer to me. So, is it? If so, I'll try and implement it, but I want to be sure I'm not wasting time on something that's impossible.

2. Since you mentioned it: the RS is down and all data was lost (including job files from both MSP Server and Clients). A new RS is up with the same parameters, but totally empty. Once MSP sees that the RS is up again, it will (I believe) automatically create the necessary folder structure and put the necessary package and RS info files in place. Questions:
2.1. Will MSP Server put its job files back _immediately_? (I think not, because otherwise the same job may get executed twice on a device.)
2.1.1. If yes, how does a device know that it is not a dupe?
2.1.2. If no, will MSP wait for a certain timeout, etc. before putting a job back, or is human intervention required? (I believe it will evaluate compliance and generate jobs based on that, but is there any defined timeout for that or something like it?)
2.2. Will there be any disturbances from the device (MSP Client) side? I believe not (the MSP Client will just upload a new Discovery Doc, see that there are no jobs, and that is it).
2.3. What events should happen for "normal" operation to resume (assuming job files were lost when the old RS crashed)? I.e., what sequence of events (and timeouts) would happen before a device is "seen" again by MSP and a job gets to the RS? I believe first the device should check into the RS, then MSP should poll the RS, find the device to be non-compliant and create a job. Am I missing anything?
2.4. Again, what about completed job reports? Is there a way to make sure MSP won't send a job to a device again because a report was lost? The manual is not very clear about how devices report job status to MSP Server (another job file? the same edited job file? part of the Discovery Doc?).

Thanks. I believe if all of these questions are answered, I'll be totally content :)


Allan Herrod

Arsen;

MSP will rebuild a Relay Server if it finds it has become "empty".  But MSP will NOT resend Job files that it previously sent before the Relay Server failed.  MSP will send any NEW Jobs that are generated after the Relay Server is rebuilt.  Similarly, if a device moves from one Relay Server to another, MSP will NOT resend Job files to the new Relay Server that were previously sent to the old Relay Server, but MSP WILL send any NEW Jobs that are generated after the move to the new Relay Server.

Basically, once a Job file is sent, it's sent.  If MSP does not see the Job status change (which is indicated by renaming the file as explained in the Synchronous Execution section of the Using MSP document) within the Job stale timeout, then MSP will time the Job out.  It really doesn't matter why a Job never gets processed by a device; MSP will not resend it and eventually it will time out.  Once a Job times out for a device, you will need to click Reprocess for the Policy before MSP will send a new Job for that Policy.  You can change the Job timeout via the Admin Tool.
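
The lifecycle described above can be sketched as a small state model. This is only a simplified illustration of the rules as stated in this post, not MSP's actual implementation; the `Job` class, the state names and both functions are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class JobState(Enum):
    SENT = "sent"            # Job file placed on the Relay Server, once
    DONE = "done"            # status change seen (Job file was renamed)
    TIMED_OUT = "timed_out"  # no status change within the stale timeout


@dataclass
class Job:
    policy: str
    sent_at: float
    state: JobState = JobState.SENT


def evaluate(job: Job, now: float, stale_timeout_s: float,
             status_changed: bool) -> JobState:
    """Advance a sent Job: done if its status changed, timed out if stale.
    The reason a Job was never processed plays no role in the outcome."""
    if job.state is JobState.SENT:
        if status_changed:
            job.state = JobState.DONE
        elif now - job.sent_at >= stale_timeout_s:
            job.state = JobState.TIMED_OUT
    return job.state


def may_send_after_timeout(job: Job, reprocess_clicked: bool) -> bool:
    """A timed-out Job is never resent; a NEW Job for the same Policy
    goes out only after Reprocess has been clicked."""
    return job.state is JobState.TIMED_OUT and reprocess_clicked
```

Note that nothing in this model resends the original Job, which matches the point that a lost Relay Server or a roamed device simply leads to a timeout.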

Generally, a device really doesn't care if a Job is a duplicate and has no way to detect such a thing.  A Job is a Job and if the device sees a Job, it executes it.  In most cases, the Bundle referenced by a Job would not use Force Install and hence any Packages that are already present would not be reloaded.  But any Install Steps that have Force Install WOULD be repeated if a duplicate Job ever occurred.  But MSP only sends any one Job ONCE, hence a duplicate cannot really ever occur.  So, the current handling is just how it deals with an intentional repeat of the same Bundle, such as by manual Policy push or Action execution.

All that said, and getting back to the original question, there really is no generic way to do Relay Server "autodiscovery" since that would imply that somehow the device could find its Relay Server AND get the credentials it needed to access it.  But if you know that a device could move between a small number of KNOWN Relay Servers, then you could have it switch programmatically, without having to write code.

Put the BLOB files for the Relay Server(s) you want to switch to into a Package, along with a Batch File for each to launch the RD Client  with the proper command line to apply each BLOB.  Be sure to deploy that Package before you send down LockAndWipe Settings that are going to use that stuff.  You can export the Relay Server BLOB files you need from MSP using Transfer->Export Relay Server BLOBs.

Now, the hardest part is deciding WHEN to apply a given Relay Server Settings.  I can't really help you there, but perhaps it could be done based on what WLAN you are on?  Or whether you are on WLAN or WWAN or whatever.  Anyway, you need to come up with a way to know when and create a Condition Object to detect that situation.  Then make a LockAndWipe Settings Object to check for that Condition and perform a Wipe operation based on it.  And as the ONLY Wipe Action, run the appropriate Batch File.  If you have more than one Relay Server to set, have a suitable Condition for each and a different Wipe Scenario for each.
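
The "one Condition and one Wipe Scenario per Relay Server" arrangement amounts to a lookup from a detected situation to a Batch File. The sketch below shows that mapping only, assuming the WLAN ESSID is the trigger; the ESSIDs, batch file names and the `batch_for` function are invented for illustration, and in MSP this would be configured as Condition and LockAndWipe Settings Objects rather than written as code.

```python
from typing import Optional

# Hypothetical mapping: WLAN ESSID -> Batch File that applies the matching
# Relay Server BLOB (one entry per Condition / Wipe Scenario pair).
WIPE_SCENARIOS = {
    "SITE-A-WLAN": "apply_rs1.bat",
    "SITE-B-WLAN": "apply_rs2.bat",
}


def batch_for(essid: str, last_applied: Optional[str]) -> Optional[str]:
    """Return the Batch File to run for this ESSID, or None when the ESSID
    is unknown or its Relay Server Settings are already applied."""
    wanted = WIPE_SCENARIOS.get(essid)
    if wanted is None or wanted == last_applied:
        return None
    return wanted
```

Skipping the case where the wanted settings are already applied avoids re-running the same Batch File every time the Condition is evaluated.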

That's about the best I can suggest.

Allan


Arsen Bandurian
I think that is sufficient. Thanks Allan.

