This isn't a fully worked out doc, but an attempt to collect some ideas.


Conventions/Notation
====================


1. Gimbals
--------------------

For the purpose of this proposal we use the following conventions:


# * gimbal
#
#  The physical device itself. 


Gimbals come in a large variety of types and with different capabilities. A gimbal can be a simple device which only provides pitch and pan control, e.g., via an RC input. It could also provide more advanced features, such as horizon drift compensation, ROI, tracking, and so on, controlled e.g. via a serial UART or CAN input using a proprietary protocol. It can also have a MAVLink interface, which in turn can be of different levels of sophistication. For instance, it could support just the minimally required message set, or also advanced messages. In addition, the gimbal might have further functions such as camera support, again either through a non-MAVLink interface or through MAVLink. The gimbal can also come as a 1-axis, 2-axis or 3-axis gimbal, and in different configurations such as pitch-roll or pitch-yaw in the case of 2-axis gimbals, or pitch-roll-yaw, roll-pitch-yaw or roll-yaw-pitch in the case of 3-axis gimbals.

This gives rise to these conventions:


# * MAVLink gimbal
#
#  Gimbals which are used via a MAVLink interface.


# * non-MAVLink gimbal
#
#  Gimbals which are not controlled via MAVLink but e.g. via PWM, a proprietary serial protocol, etc.


The minimum capability of a gimbal, under the terms of this proposal, is pitch control. An example would be the 3DR Solo gimbal. Here the pan control is taken over by the copter: to point the camera in a given direction, the gimbal sets the pitch and the copter sets the pan. The pan angle is accordingly absolute, in world coordinates. Another example would be a pitch-roll 2-axis gimbal, where the pan control is necessarily taken over by the copter as well.

The next level of capability is pitch and pan control. As regards pan, two cases need to be distinguished:

(1) The gimbal is following the orientation of the vehicle but it can be made to turn around by sending a yaw angle to the gimbal. For 3-axis gimbals, this mode is often called "pan", "follow" and similar. The yaw angle is thus relative to the vehicle orientation.

(2) The gimbal's pan is independent of the vehicle's orientation, but the gimbal can also be made to turn around by sending a yaw angle to the gimbal. This mode is often called "hold", "lock" and similar. The yaw angle is thus absolute in world coordinates.

This gives rise to these conventions:


# * follow
#
#  The modus operandi in which an orientation axis of the gimbal is kept aligned with the respective axis of
#  the vehicle is called "follow" mode.


# * lock
#
#  The modus operandi in which an orientation axis of the gimbal is independent of the respective axis of the
#  vehicle is called "lock" mode. 


# * relative yaw
#
#  The yaw angle sent to the gimbal, or the yaw angle reported by the gimbal, is relative to the vehicle's yaw.


# * absolute yaw
#
#  The yaw angle sent to the gimbal, or the yaw angle reported by the gimbal, is absolute in world coordinates.


# * has yaw control
#
#  A gimbal which can be turned around the yaw axis via some control command (PWM, serial, MAVLink) is said
#  to "have yaw control". For gimbals which do not have that property, pan control has to be taken over by the
#  vehicle. 
 
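The relation between relative and absolute yaw can be sketched as follows. This is only a minimal illustration: the function names are made up for this example, angles are assumed to be in degrees, and the wrap interval [-180, 180) is an arbitrary choice, not something this proposal prescribes.

```python
def wrap_deg(angle_deg: float) -> float:
    """Wrap an angle to the interval [-180, 180) degrees."""
    return (angle_deg + 180.0) % 360.0 - 180.0


def relative_to_absolute_yaw(relative_yaw_deg: float, vehicle_yaw_deg: float) -> float:
    """Absolute yaw (world frame) = vehicle yaw + yaw relative to the vehicle."""
    return wrap_deg(vehicle_yaw_deg + relative_yaw_deg)


def absolute_to_relative_yaw(absolute_yaw_deg: float, vehicle_yaw_deg: float) -> float:
    """Relative yaw = absolute yaw (world frame) - vehicle yaw."""
    return wrap_deg(absolute_yaw_deg - vehicle_yaw_deg)


# Vehicle heading 170 deg, gimbal turned 30 deg to the right of the vehicle.
absolute = relative_to_absolute_yaw(30.0, 170.0)      # -160.0
relative = absolute_to_relative_yaw(absolute, 170.0)  # 30.0
```

Note that the conversion needs the vehicle's yaw, which is one reason why a gimbal reporting absolute yaw depends on data from elsewhere in the system.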

The most typical operation mode is lock-lock-follow, where the pitch and roll axes are stabilized with respect to the world coordinates (lock), and the yaw axis is stabilized with respect to the vehicle's yaw orientation (follow).

The most typical angle coordinate system is thus absolute-absolute-relative, where the pitch and roll angles are with respect to the world coordinates (absolute), and the yaw angle is with respect to the vehicle's yaw (relative).

However, many other combinations could also be realized. For instance, a gimbal could be in lock-lock-follow mode, but the yaw angle could be absolute.

The minimum capability of a gimbal under the terms of this proposal is the lock-lock-follow mode. This excludes e.g. 1-axis pitch gimbals. It does, however, include e.g. the 3DR Solo gimbal or a pitch-roll 2-axis gimbal.

There is this important twist:

"Yaw lock" and "absolute yaw" can only be provided by relatively sophisticated gimbals and require support from e.g. the autopilot. The technical reason is gyro drift, which for the yaw axis cannot be corrected by using an accelerometer. Gimbals relying only on 6DOF sensors can thus only be operated reliably with the yaw axis in follow mode and with relative yaw angles. Such gimbals can often nevertheless be put into yaw lock mode, but they cannot hold it reliably for an extended period of time, because the yaw drift makes them lose their orientation over time; this is not considered reliable operation under the terms of this proposal.

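The effect of gyro drift can be illustrated with a toy simulation. It is a sketch under loudly stated assumptions: the gimbal is at rest, the gyro has a constant bias, pitch is fused with the accelerometer through a simple complementary filter, and yaw is obtained by pure integration. The bias, time step and filter constant are made-up illustrative values, not measurements of any real gimbal.

```python
def simulate_drift(bias_dps=0.05, dt=0.01, duration_s=600.0, alpha=0.98):
    """Return (pitch_error_deg, yaw_error_deg) for a gimbal at rest.

    All parameter values are hypothetical and chosen for illustration only.
    """
    steps = int(round(duration_s / dt))
    pitch_est = 0.0
    yaw_est = 0.0
    accel_pitch = 0.0  # the accelerometer sees gravity and reports "level"
    for _ in range(steps):
        gyro_rate = bias_dps  # true rate is zero, only the bias remains
        # Pitch: the gyro integral is continuously pulled back to the accelerometer.
        pitch_est = alpha * (pitch_est + gyro_rate * dt) + (1.0 - alpha) * accel_pitch
        # Yaw: gravity carries no yaw information, so the bias integrates freely.
        yaw_est += gyro_rate * dt
    return pitch_est, yaw_est


pitch_error, yaw_error = simulate_drift()
# pitch_error stays bounded at a few hundredths of a degree,
# yaw_error grows linearly with time (about bias_dps * duration_s).
```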
There is this second important twist:

The vehicle orientation is specified in NED, and the usually chosen Euler angles have their gimbal lock at pitch ±90° (straight up or down). However, most gimbals won't use this set of Euler angles, for mechanical reasons or for practical reasons, such as that the camera should be pointable straight downwards. In many practical cases the different Euler angles don't cause issues, e.g. when the gimbal stabilizes roll to zero and only pitch and yaw (tilt and pan) are controlled. In general, however, the incompatibility of the different Euler angles leads to all sorts of intricacies and pain points, which must not be ignored and must be properly accounted for in a protocol.

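Both points can be checked numerically. The sketch below builds rotation matrices for the vehicle sequence Z-Y-X and, as an assumed example of a gimbal sequence, for yaw-roll-pitch (Z-X-Y); the helper names are invented for this illustration. It shows that at pitch 90° two different (yaw, roll) pairs describe the same attitude, and that the same three numbers interpreted in the two sequences describe different attitudes.

```python
import math


def _rot(axis, angle_deg):
    """Elementary rotation matrix about the x, y or z axis."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    if axis == "x":
        return [[1, 0, 0], [0, c, -s], [0, s, c]]
    if axis == "y":
        return [[c, 0, s], [0, 1, 0], [-s, 0, c]]
    if axis == "z":
        return [[c, -s, 0], [s, c, 0], [0, 0, 1]]
    raise ValueError(axis)


def _mul(a, b):
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def r_zyx(yaw, pitch, roll):
    """Vehicle Euler angles: R = Rz(yaw) * Ry(pitch) * Rx(roll)."""
    return _mul(_mul(_rot("z", yaw), _rot("y", pitch)), _rot("x", roll))


def r_zxy(yaw, roll, pitch):
    """Assumed gimbal sequence yaw-roll-pitch: R = Rz(yaw) * Rx(roll) * Ry(pitch)."""
    return _mul(_mul(_rot("z", yaw), _rot("x", roll)), _rot("y", pitch))


def same_attitude(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(3) for j in range(3))


# Gimbal lock of the vehicle Euler angles at pitch = 90 deg:
lock = same_attitude(r_zyx(30, 90, 0), r_zyx(0, 90, -30))          # True
# Same numbers, different sequence, different attitude:
differ = not same_attitude(r_zyx(10, 30, 20), r_zxy(10, 20, 30))   # True
```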
This gives rise to these conventions:


# * Euler angles
# 
#  Any of the 12 possible sets of three angles which can be used to specify an attitude is called Euler angles. The
#  distinction between e.g. proper and improper Euler angles, or Euler, Tait-Bryan and Cardan angles, is not adopted.


# * vehicle Euler angles
#
#  If needed for clarity, the Euler angles used to specify the vehicle's attitude are called "vehicle Euler angles".
#  This is the set known as e.g. "pan, tilt, roll", "roll, pitch, yaw" or more precisely as Z-Y-X, x-y-z or 321.


# * gimbal Euler angles
#
#  If needed for clarity, the Euler angles used by the gimbal are called "gimbal Euler angles".


The notation "pan", "tilt", "roll", "pitch", "yaw" is also often used for denoting the gimbal Euler angles, but it should be clearly understood that they usually refer to a different definition of Euler angles; the similar names should not be assumed to imply similar meaning.

The gimbal of course also may support more advanced situations or provide more advanced features than pitch and pan control and lock-lock-follow.

This gives rise to these conventions:


# * simple gimbal
#
#  A gimbal which only provides pitch and pan control, lock-lock-follow mode, and relative yaw. Pan control
#  can be via the vehicle.


# * smart gimbal
#
#  A gimbal which in addition at minimum supports yaw lock and absolute yaw angles. 


Gimbals which do not at least qualify as a simple gimbal, i.e. fall short of providing this minimal set of features, are not considered in this proposal.

Smart gimbals usually need support from e.g. the autopilot, i.e. receive specific messages from it (or make use of some additional sensors beyond 6DOF gyro-accelerometers). They usually also provide horizon drift correction to achieve a stable horizon even in sharp maneuvers.

The gimbal may support even more advanced features, such as ROI, targeting, camera features, and so on, but no notation is introduced for them.



2. System
--------------------

A vehicle typically consists of multiple hardware devices which are physically connected through communication links.

This gives rise to these conventions:


# * device
#
#  A physical piece of hardware which is connected through physical communication links to other devices.


Note that this not only includes the pieces of hardware located on the vehicle, but also e.g. a tablet running a ground control station, which is connected to the vehicle via some telemetry link. That is, in this notation a GCS is a device.


# * component
#
#  Piece of software running on a piece of hardware which behaves like a component under the terms
#  of the MAVLink specification.


Note that this also includes the ground control station software. That is, a GCS software is a component.

By definition, a component thus has to support the minimal MAVLink message set; e.g., it is required to send out MAVLink heartbeats and to have its own unique component ID.

Note the distinction between device and component. For instance, a device may represent itself in the MAVLink network as two separate components. A typical example would be an integrated gimbal/camera. A companion computer may or may not be a component under the terms of this proposal, depending on what software it runs. On the other hand, an autopilot is usually a device which runs only the autopilot component, in which case the distinction is superfluous.


# * system
#
#  All components which are logically connected by a common MAVLink system ID.


So, formally, "vehicle" refers to a network of physically connected devices and "system" refers to a network of logically connected components. However, the words "system" and "vehicle" may be used interchangeably.


# * gimbal component
#
#  Piece of software running on a device which behaves like a component under the terms of the MAVLink
#  specification and which in addition handles gimbal-related MAVLink messaging. It usually has a gimbal
#  component ID (but doesn't have to according to the MAVLink spec).


Note the distinction between gimbal and gimbal component. The gimbal component will typically be located on the gimbal, i.e., the gimbal component software will typically run on the gimbal device, but this doesn't have to be so. For instance, the gimbal component could be located on the autopilot (which then has to emit two heartbeats, the autopilot and gimbal heartbeat), or could be located on a companion computer.  

Note also that a gimbal component doesn't necessarily have to exist under the terms of this proposal (vide infra). 

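The device/component distinction can be sketched as a small data model. The class and field names are invented for this illustration; the two component ID values are those of MAV_COMP_ID_AUTOPILOT1 and MAV_COMP_ID_GIMBAL in the MAVLink common dialect. The example shows an autopilot device hosting both an autopilot component and a gimbal component, and hence emitting two heartbeats.

```python
from dataclasses import dataclass, field

MAV_COMP_ID_AUTOPILOT1 = 1   # MAVLink common dialect
MAV_COMP_ID_GIMBAL = 154     # MAVLink common dialect


@dataclass(frozen=True)
class Component:
    """A MAVLink component: software identified by (system ID, component ID)."""
    system_id: int
    component_id: int
    name: str


@dataclass
class Device:
    """A physical piece of hardware which may host any number of components."""
    name: str
    components: list = field(default_factory=list)

    def heartbeats(self):
        """One heartbeat source (system ID, component ID) per hosted component."""
        return [(c.system_id, c.component_id) for c in self.components]


autopilot = Device("autopilot", [
    Component(1, MAV_COMP_ID_AUTOPILOT1, "autopilot component"),
    Component(1, MAV_COMP_ID_GIMBAL, "gimbal component"),
])
sources = autopilot.heartbeats()  # [(1, 1), (1, 154)]
```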
A non-MAVLink gimbal must be physically connected to another MAVLink-capable device in order to make it accessible to the MAVLink network. Therefore, on this device some piece of software must be running which handles the communication with the non-MAVLink gimbal.

This gives rise to this convention:


# * gimbal driver
#
#  Piece of software running on a device to communicate with a non-MAVLink gimbal.


Note the distinction between gimbal driver and gimbal component. For instance, in many cases the non-MAVLink gimbal will be connected to the autopilot, and the autopilot will integrate the gimbal driver, but not also a gimbal component. In this case a gimbal component does not actually exist, since the autopilot component can control the gimbal via the integrated gimbal driver.

There is this twist:

In principle, the gimbal driver could also be located "elsewhere" and not necessarily on the autopilot or gimbal component. This would require some additional pass-through-type ultra-low-level MAVLink messaging. It's not yet clear if this should or should not be supported under the terms of this proposal.

Therefore, essentially two situations exist:

(1) A gimbal component is present, which is the target for gimbal handling in the MAVLink network.

(2) A non-gimbal component with a gimbal driver is present, which is the target for gimbal handling in the MAVLink network.



3. Control Functions
--------------------

For the purpose of this proposal we use the following conventions:


# * attitude/orientation
#  
#  The gimbal (or gimbal+vehicle) moves such as to maintain the specified camera orientation


Note that the camera may not actually be aligned with the specified camera orientation. This will depend on whether the axis is in follow or lock mode. In follow mode the specified camera orientation is relative to the vehicle's orientation.

# * ROI
#  
#  The gimbal (or gimbal+vehicle) moves such that the camera points at the specified ROI


Note that the ROI may be specified in absolute or relative coordinates (which is a distance). Also, the specified ROI may not be the actual point the camera is pointing to, e.g. if some nudging is applied, or depending on follow or lock modes.

In contrast to the attitude/orientation control, which is a basic function of any gimbal, ROI support can be achieved in at least two different ways:

(1) The gimbal itself is not able to do ROI. This means that another piece of software running on some other device must do the required math and send the gimbal the respective attitude/orientation information. This proposal will specify that this task is done by the gimbal manager (vide infra).

(2) The gimbal itself can do ROI. This usually means that the gimbal needs to be supplied with some location/position data stream.

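For case (1), the "required math" could look like the following minimal sketch. It assumes that vehicle and ROI positions are both available in a common local NED frame in metres, that the yaw axis accepts absolute yaw, and that the offset between camera and vehicle origin is negligible; the function name is hypothetical. Pitch is negative when looking down.

```python
import math


def roi_to_angles(vehicle_ned, roi_ned):
    """Return (absolute_yaw_deg, pitch_deg) pointing from the vehicle to the ROI.

    Both positions are (north, east, down) tuples in metres.
    """
    d_north = roi_ned[0] - vehicle_ned[0]
    d_east = roi_ned[1] - vehicle_ned[1]
    d_down = roi_ned[2] - vehicle_ned[2]
    horizontal = math.hypot(d_north, d_east)
    yaw_deg = math.degrees(math.atan2(d_east, d_north))
    pitch_deg = math.degrees(math.atan2(-d_down, horizontal))
    return yaw_deg, pitch_deg


# Vehicle 10 m above ground, ROI on the ground 10 m to the north:
yaw, pitch = roi_to_angles((0.0, 0.0, -10.0), (10.0, 0.0, 0.0))
# yaw is 0 deg (north), pitch is about -45 deg (looking down)
```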
It can happen that there are several sources which desire to control the gimbal. For instance, a ROI may have been set while joysticks are also available.

This gives rise to these conventions:


# * nudging
#
#  The orientation information from different sources is "added up" to a final orientation of the gimbal.


# * overriding
#
#  The orientation information from only one source determines the actual orientation of the gimbal.


Obviously, these functions may easily induce conflicts, e.g. when two sources want to override. This proposal will specify that it's the gimbal manager's responsibility to deconflict (vide infra).

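The difference between the two can be sketched as follows. The `Setpoint` type, the source names and the plain summation of angles are assumptions made for this illustration; how a real gimbal manager combines or selects sources is deliberately left open by this proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Setpoint:
    source: str
    pitch_deg: float
    yaw_deg: float


def nudge(setpoints):
    """Nudging: the contributions of all sources are added up."""
    return (sum(s.pitch_deg for s in setpoints), sum(s.yaw_deg for s in setpoints))


def override(setpoints, winner):
    """Overriding: only the named source determines the orientation."""
    for s in setpoints:
        if s.source == winner:
            return (s.pitch_deg, s.yaw_deg)
    raise KeyError(winner)


sources = [Setpoint("roi", -30.0, 45.0), Setpoint("joystick", 5.0, -10.0)]
nudged = nudge(sources)                    # (-25.0, 35.0)
overridden = override(sources, "joystick") # (5.0, -10.0)
```

With nudging the joystick shifts the view around the ROI; with overriding the ROI is disregarded for as long as the joystick is the selected source.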


4. Messaging
--------------------

The data conveyed in messages can be classified as static or dynamic. Static data are those pieces of information which usually do not change during runtime, such as the FOV of a camera or the memory size of an SD card. Dynamic data are those pieces of information which may change during runtime, such as the voltage of a battery.

Note however that dynamic data do not necessarily have to change in a particular run. They may change, but also may not. The possibility of change is what qualifies data as dynamic, not the actual change.

Although not strictly followed in the MAVLink specification, there appears to be a tendency to call messages conveying static data INFORMATION and messages conveying dynamic data STATUS.

This gives rise to these conventions:


# * INFORMATION message
#
#  Message which exclusively conveys static data.


# * STATUS message
#
#  Message which predominantly conveys dynamic data (but may also convey static data if of significant advantage).



5. Conflicts
--------------------

There can be several components in the system which may want to control the gimbal, and do so by sending messages to the gimbal.

This obviously can create situations where the gimbal receives messages with inconsistent information from the different components, and hence may behave unintentionally or unexpectedly from the viewpoint of one of the components (and ultimately the user). Thus some sort of decision making or prioritization is required in order to establish proper gimbal operation and prevent the gimbal from seemingly behaving erratically.

This gives rise to this convention:


# * deconfliction
#
#  Process by which it is ensured that a gimbal behaves reasonably at all times.


As regards the possible deconfliction strategies, there is this important twist:

One approach to deconflict is through cooperativity, that is, by requiring that each component behaves such that conflicts do not occur. This however implies that each component knows what the other components are doing and/or how they are behaving. Such a scenario could be called "distributed responsibility". It is the approach which by and large has been followed in the past, but it did not provide the best results: standardization and coordination of all the software pieces and teams involved is difficult, and the approach also makes it difficult to add new features.

One could also think that deconfliction could be achieved through priority in time, that is, the gimbal simply does what the last received message asked for. However, since the information comes via MAVLink messages/commands at certain points in time, and any component may send them at any time, it is easy to see that such a prioritization scheme can't actually resolve conflicts. It is thus not a viable option.

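Why "last message wins" does not resolve anything can be seen in a toy example. Two hypothetical components each stream their own yaw setpoint at a fixed rate; the names, rates and values are made up for this illustration. The gimbal ends up flipping between the two setpoints with every received message.

```python
def apply_last_message_wins(streams):
    """Merge all messages in time order; the gimbal applies each one as it arrives.

    streams maps a source name to a list of (time_ms, yaw_setpoint_deg).
    Returns the applied messages as (time_ms, source, yaw_setpoint_deg).
    """
    return sorted(
        (t_ms, source, yaw)
        for source, msgs in streams.items()
        for t_ms, yaw in msgs
    )


def count_flips(applied):
    """Number of times the applied setpoint changes between consecutive messages."""
    return sum(1 for prev, cur in zip(applied, applied[1:]) if prev[2] != cur[2])


streams = {
    "gcs": [(t, 90.0) for t in range(0, 1000, 200)],          # wants yaw +90
    "companion": [(t, -90.0) for t in range(100, 1000, 200)],  # wants yaw -90
}
applied = apply_last_message_wins(streams)
flips = count_flips(applied)  # 9 flips for 10 messages within one second
```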
Not all messages induce conflicts, however. A simple example would be the messages to read and set parameters; other examples would be the various INFORMATION and STATUS messages, or the commands to request them. Therefore the set of messages related to gimbal operation can be divided into those which may induce a conflict and those which may not. The messages which may induce a conflict are generally of the type that controls the gimbal in some way, e.g., to set the orientation or to change the operation mode.

This gives rise to this convention:


# * GIMBAL CONTROL message
#
#  Message which may induce a conflict.


Only gimbal control messages need to be subjected to deconfliction.



Proposed Architecture
=====================

A primary goal of any protocol must be to provide means to deconflict. The proposal here is to require that:

(1) Proper gimbal operation must be ensured at all times, and not only for (short or long) periods of time.

(2) Proper gimbal operation must not depend on the cooperativity of other components in the system; it must be ensured even if some components are not "well behaving" (e.g. do not follow the standard).

This can be achieved by these deconfliction rules:

(1) One and only one player is responsible for deconfliction for an extended period of time (i.e. not just momentarily or until the next message).

(2) The deconfliction responsibility can be taken away from that player only with the consent of this player, or by a player with overruling powers.

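The two rules can be sketched as a tiny ownership object. The class, its method and the two flags are hypothetical names chosen for this illustration; how consent is obtained and who holds overruling powers is not specified here.

```python
class DeconflictionOwnership:
    """Tracks the single player currently responsible for deconfliction (rule 1)."""

    def __init__(self, owner):
        self.owner = owner

    def transfer(self, requester, owner_consents=False, requester_can_overrule=False):
        """Hand over responsibility only with consent or overruling powers (rule 2).

        Returns True if the responsibility changed hands.
        """
        if owner_consents or requester_can_overrule:
            self.owner = requester
            return True
        return False


ownership = DeconflictionOwnership("autopilot")
refused = ownership.transfer("gcs")                                   # False, owner unchanged
handed_over = ownership.transfer("gcs", owner_consents=True)          # True, owner is "gcs"
overruled = ownership.transfer("safety_pilot", requester_can_overrule=True)  # True
```

The point of the sketch is that a request alone never changes the owner, which is what distinguishes this scheme from "last message wins".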

In order to achieve such robust deconfliction, and yet accommodate gimbals with varying degrees of capabilities and various hardware setups, the proposal is to introduce a gimbal manager, which is responsible for deconfliction, and which allows us to split up the communication into essentially two parts.


# * gimbal manager
#
#   A piece of software running on any of the devices in the network, which is responsible for
#   deconflicting gimbal control messages.


# * extraneous component (stupid name, I know)
#
#  Any other component interested in controlling the gimbal (i.e. any component which is not the
#  gimbal manager or the gimbal component).


Accordingly, two sets of MAVLink messages are introduced:


# * gimbal message set
#
#  Messages exchanged between the gimbal manager and the gimbal component.


# * gimbal manager message set
#
#  Messages exchanged between the gimbal manager and any extraneous components (any other component
#  interested in controlling the gimbal).

 
The messages of the gimbal set are typically of lower-level nature, while the messages of the gimbal manager set are typically of higher-level nature, in the sense that the gimbal manager messages provide a more user-friendly API.


Deconfliction is achieved by requiring these two rules, our "Two Commandments":

(1) There might be several gimbal managers in the system, but only one gimbal manager can be active at a time.

(2) The gimbal component must act only upon the gimbal control messages which come from the gimbal manager. It must ignore any gimbal control messages coming from extraneous components, and it must ignore any control message of the gimbal manager set.

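On the gimbal component side, the second Commandment amounts to a simple acceptance filter. The sketch below uses invented types and string identifiers rather than real MAVLink IDs or message names; it only illustrates the decision logic. Non-control messages pass, since only gimbal control messages are subject to deconfliction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: str        # hypothetical sender identifier
    message_set: str   # "gimbal" or "gimbal_manager"
    is_control: bool   # True for gimbal control messages


class GimbalComponent:
    def __init__(self, active_manager):
        self.active_manager = active_manager

    def accepts(self, msg):
        """Decide whether the gimbal component acts upon a received message."""
        if not msg.is_control:
            return True   # not subject to deconfliction
        if msg.message_set != "gimbal":
            return False  # control messages of the gimbal manager set are ignored
        return msg.sender == self.active_manager  # only the active gimbal manager


gimbal = GimbalComponent(active_manager="autopilot")
ok = gimbal.accepts(Message("autopilot", "gimbal", True))    # True
ignored = gimbal.accepts(Message("gcs", "gimbal", True))     # False
```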
Notes:
* The Two Commandments effectively mean that the gimbal manager and only the gimbal manager has control over the gimbal component.

* The particular behavior or implementation of the gimbal manager is NOT specified as part of this proposal. Fundamentally, as long as it achieves the outlined tasks, there are no rules as regards to how it has to achieve that. That is, for as long as it adheres to the Two Commandments it shall be considered a valid implementation under the terms of this proposal. This implies that the exact system behavior can and will depend on the particular flight stack or implementation.

* The non-control messages are not subject to deconfliction, and thus need not be handled by the gimbal manager, i.e. may bypass the gimbal manager. For instance, the gimbal component itself can and should broadcast its status with GIMBAL_ATTITUDE_STATUS to everyone. Vice versa, the parameters of the gimbal would be requested from the gimbal component directly and not through the gimbal manager. 

* The Two Commandments need to be complemented by a set of managing messages and flags in order to e.g. allow components to determine which component is the gimbal manager, retrieve information from the gimbal manager, handle multiple gimbals or even dynamic changes of the gimbal manager, allow nudging and overriding, and so on. These are however not part of the Commandments, since a gimbal manager implementation may or may not use them. This complementary managing message set can thus be described separately, further below.
  
* The location of the various pieces of software (component, gimbal driver, gimbal manager) on the various devices (autopilot, companion, gimbal, GCS, ...) can be quite flexible ("located" means that the piece of software is running on the piece of hardware where it is located). 


To illustrate the concept, three examples shall be discussed:
   
(A) Non-MAVLink gimbal connected to the autopilot.

- gimbal driver: located on the autopilot
- gimbal component: not existing
- gimbal manager: located on the autopilot
   
In this arrangement a gimbal component does not exist, since the autopilot communicates with the gimbal via the gimbal driver. Also, gimbal messages would never be exchanged, as all communication would be between the gimbal manager and the extraneous components, i.e., use messages from the gimbal manager set.

(B) MAVLink gimbal connected to the autopilot.

- gimbal driver: not existing since only needed for non-MAVLink gimbals
- gimbal component: located on the gimbal device
- gimbal manager: located on the autopilot

(C) "Intelligent" MAVLink gimbal connected to the autopilot.

- gimbal driver: not existing since only needed for non-MAVLink gimbals
- gimbal component: located on the gimbal device
- gimbal manager: located on the gimbal device

In this arrangement the gimbal runs both the gimbal component and gimbal manager software. Messages of the gimbal set would thus never be exchanged, as all communication would be between the gimbal manager and the extraneous components, i.e., use messages from the gimbal manager set. That is, the gimbal presents itself to the outside with the more user-friendly API of the gimbal manager message set.
   
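The three arrangements can be written down as data, which also makes it easy to see when messages of the gimbal set actually travel between devices. The `Arrangement` type and the device names are assumptions of this sketch; the rule encoded in `gimbal_set_on_the_wire` is read off the examples above (a gimbal component must exist and must sit on a different device than the gimbal manager).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Arrangement:
    name: str
    driver_on: Optional[str]      # device hosting the gimbal driver, if any
    component_on: Optional[str]   # device hosting the gimbal component, if any
    manager_on: str               # device hosting the gimbal manager

    def gimbal_set_on_the_wire(self):
        """True if gimbal-set messages are exchanged between two devices."""
        return self.component_on is not None and self.component_on != self.manager_on


A = Arrangement("A", driver_on="autopilot", component_on=None, manager_on="autopilot")
B = Arrangement("B", driver_on=None, component_on="gimbal", manager_on="autopilot")
C = Arrangement("C", driver_on=None, component_on="gimbal", manager_on="gimbal")

on_wire = {a.name: a.gimbal_set_on_the_wire() for a in (A, B, C)}
# {"A": False, "B": True, "C": False}
```

Only arrangement (B) needs the gimbal message set on a physical link; in (A) and (C) the outside world sees the gimbal manager message set only.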


Multiple Gimbals
=================
       
In order to support multiple gimbal devices there are a couple of measures:

* Multiple gimbal component IDs for gimbal devices. It would be tempting to use the default MAVLink gimbal component IDs for that purpose, but this can't work since a gimbal may or may not have a MAVLink gimbal component ID, and also non-MAVLink gimbals need an ID.

* For each gimbal device there is one associated gimbal manager. A gimbal manager implementation may be able to handle several gimbals, but it then has to present itself to the outside as a gimbal manager for each gimbal.

* Gimbal managers announce which gimbal device they map to.

* Where needed, messages have a parameter to indicate the gimbal ID (or 0 to address all gimbals).
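
The last two points can be sketched as a small registry. The class and method names are hypothetical and no real MAVLink message is modelled; the sketch only shows gimbal managers announcing the gimbal device they map to, and a gimbal ID of 0 being resolved to all known gimbals.

```python
ALL_GIMBALS = 0  # gimbal ID 0 addresses all gimbals


class GimbalManagerRegistry:
    def __init__(self):
        self._manager_for = {}  # gimbal ID -> name of the associated gimbal manager

    def announce(self, manager, gimbal_id):
        """A gimbal manager announces which gimbal device it maps to."""
        if gimbal_id == ALL_GIMBALS:
            raise ValueError("0 is reserved for addressing all gimbals")
        self._manager_for[gimbal_id] = manager

    def resolve(self, gimbal_id):
        """Return the gimbal IDs addressed by the gimbal ID given in a message."""
        if gimbal_id == ALL_GIMBALS:
            return sorted(self._manager_for)
        return [gimbal_id] if gimbal_id in self._manager_for else []

    def manager_of(self, gimbal_id):
        return self._manager_for[gimbal_id]


registry = GimbalManagerRegistry()
registry.announce("autopilot", 1)  # one implementation may present itself
registry.announce("autopilot", 2)  # as a gimbal manager for each gimbal
everyone = registry.resolve(ALL_GIMBALS)  # [1, 2]
```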