vSphere Input does not collect datastore metrics #4789

Closed
photinus opened this issue Oct 2, 2018 · 52 comments
Labels: area/vsphere, bug (unexpected problem or unintended behavior)

Comments

@photinus commented Oct 2, 2018

Relevant telegraf.conf:

[agent]
interval = "10s"
round_interval = true
metric_buffer_limit = 1000
flush_buffer_when_full = true
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
debug = true
quiet = false
logfile = "/Program Files/Telegraf/telegraf.log"

hostname = ""
[[outputs.influxdb]]
urls = ["udp://10.120.1.44:8089"]

[[inputs.vsphere]]
vcenters = [ "https://vsphere.address.here/sdk" ]
username = "vsphereUSer"
password = "SuperSecretVspherePasswordOfGreatness"

datastore_metric_include = [ "*" ]

vm_metric_exclude = [ "*" ]

host_metric_include = [
"cpu.coreUtilization.average",
"cpu.costop.summation",
"cpu.demand.average",
"cpu.idle.summation",
"cpu.latency.average",
"cpu.readiness.average",
"cpu.ready.summation",
"cpu.swapwait.summation",
"cpu.usage.average",
"cpu.usagemhz.average",
"cpu.used.summation",
"cpu.utilization.average",
"cpu.wait.summation",
"mem.active.average",
"mem.latency.average",
"mem.state.latest",
"mem.swapin.average",
"mem.swapinRate.average",
"mem.swapout.average",
"mem.swapoutRate.average",
"mem.totalCapacity.average",
"mem.usage.average",
"mem.vmmemctl.average",
"net.bytesRx.average",
"net.bytesTx.average",
"net.droppedRx.summation",
"net.droppedTx.summation",
"net.errorsRx.summation",
"net.errorsTx.summation",
"net.usage.average",
"power.power.average",
"sys.uptime.latest",
]
host_metric_exclude = [] ## Nothing excluded by default
host_instances = true ## true by default

cluster_metric_exclude = [""] ## Nothing excluded by default
cluster_instances = true ## true by default
datacenter_metric_exclude = [ "
" ] ## Datacenters are not collected by default.

collect_concurrency = 4
discover_concurrency = 2

timeout = "20s"

insecure_skip_verify = true

System info:

Telegraf 1.8.0
Windows Server 2012 r2
vSphere Appliance 6.5 u1d

Steps to reproduce:

  1. Configured Telegraf to collect just host and datastore metrics
  2. Telegraf writes metrics for hosts, but no metrics for datastores

Expected behavior:

Datastore metrics are written to influxdb

Actual behavior:

No datastore metrics are written to influxdb

Additional info:

Logs:
2018-10-02T20:40:04Z D! Attempting connection to output: influxdb
2018-10-02T20:40:04Z D! Successfully connected to output: influxdb
2018-10-02T20:40:04Z I! Starting Telegraf 1.8.0
2018-10-02T20:40:04Z I! Loaded inputs: inputs.vsphere
2018-10-02T20:40:04Z I! Loaded aggregators:
2018-10-02T20:40:04Z I! Loaded processors:
2018-10-02T20:40:04Z I! Loaded outputs: influxdb
2018-10-02T20:40:04Z I! Tags enabled: host=telegraf
2018-10-02T20:40:04Z I! Agent Config: Interval:10s, Quiet:false, Hostname:"telegraf", Flush Interval:10s
2018-10-02T20:40:10Z D! [input.vsphere]: Starting plugin
2018-10-02T20:40:10Z D! [input.vsphere]: Creating client: vsphere.address.here
2018-10-02T20:40:10Z D! [input.vsphere]: Start of sample period deemed to be 2018-10-02 13:35:10.1675805 -0700 PDT m=-292.934584699
2018-10-02T20:40:10Z D! [input.vsphere]: Collecting metrics for 0 objects of type datastore for vsphere.address.here
2018-10-02T20:40:10Z D! [input.vsphere]: Discover new objects for vsphere.address.here
2018-10-02T20:40:10Z D! [input.vsphere] Discovering resources for datacenter
2018-10-02T20:40:10Z D! [input.vsphere]: No parent found for Folder:group-d1 (ascending from Folder:group-d1)
2018-10-02T20:40:10Z D! [input.vsphere] Discovering resources for cluster
2018-10-02T20:40:10Z D! [input.vsphere] Discovering resources for host
2018-10-02T20:40:11Z D! [input.vsphere] Discovering resources for vm
2018-10-02T20:40:11Z D! [input.vsphere] Discovering resources for datastore
2018-10-02T20:40:20Z D! Output [influxdb] buffer fullness: 0 / 1000 metrics.
2018-10-02T20:40:20Z D! [input.vsphere]: Latest: 2018-10-02 13:40:10.1675805 -0700 PDT m=+7.065415301, elapsed: 14.846599, resource: datastore
2018-10-02T20:40:20Z D! [input.vsphere]: Sampling period for datastore of 300 has not elapsed for vsphere.address.here

@rsasportes commented Oct 3, 2018

I have the exact same issue here.
My log shows mostly the same result as above.

Environment: vCenter 6.7 and vSphere 6.7
Storage: datastores located on NFS

@prydin (Contributor) commented Oct 3, 2018

Did you let it run for a while? It misses the first collection because background object discovery hasn't finished yet. You can force the collector to wait for the first round of discovery by setting this flag in the config:

force_discover_on_init = true

Let me know if this solved your problem.
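For reference, a minimal sketch of where this flag sits in the plugin block (the vCenter URL and credentials below are placeholders, not taken from this thread):

[[inputs.vsphere]]
  vcenters = [ "https://vcenter.example.com/sdk" ]
  username = "user"
  password = "password"
  ## Make the first collection wait until the initial object discovery has finished
  force_discover_on_init = true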

@danielnelson added the bug, need more info, and area/vsphere labels on Oct 3, 2018
@rsasportes commented Oct 4, 2018

Hi,

I've now applied your trick, and my telegraf.log is now a little more explicit about what's happening. Here it is:

2018-10-04T06:01:47Z D! [input.vsphere]: Start of sample period deemed to be 2018-10-04 05:56:47.096487102 +0000 UTC m=-279.309774333
2018-10-04T06:01:47Z D! [input.vsphere]: Collecting metrics for 40 objects of type datastore for
2018-10-04T06:01:47Z D! [input.vsphere]: Querying 37 objects, 256 metrics (3 remaining) of type datastore for 02-sys-v278.chjltn.local. Processed objects: 37. Total objects 40
2018-10-04T06:01:47Z D! Output [influxdb] wrote batch of 1000 metrics in 42.794393ms
2018-10-04T06:01:50Z D! Output [influxdb] buffer fullness: 159 / 10000 metrics.
2018-10-04T06:01:50Z D! Output [influxdb] wrote batch of 159 metrics in 21.619734ms
2018-10-04T06:01:55Z E! Error in plugin [inputs.vsphere]: took longer to collect than collection interval (10s)
2018-10-04T06:02:00Z D! Output [influxdb] buffer fullness: 19 / 10000 metrics.
2018-10-04T06:02:00Z D! Output [influxdb] wrote batch of 19 metrics in 7.921797ms

@rsasportes commented:

I have extended the interval between collections, and now I have another set of errors:



2018-10-04T06:48:16Z D! [input.vsphere]: Start of sample period deemed to be 2018-10-04 06:43:16.837579961 +0000 UTC m=-178.066128366
2018-10-04T06:48:16Z D! [input.vsphere]: Collecting metrics for 40 objects of type datastore for VCENTER
2018-10-04T06:48:16Z D! [input.vsphere]: Querying 37 objects, 256 metrics (3 remaining) of type datastore for VCENTER. Processed objects: 37. Total objects 40
2018-10-04T06:48:20Z D! Output [influxdb] buffer fullness: 565 / 10000 metrics.
2018-10-04T06:48:20Z D! Output [influxdb] wrote batch of 565 metrics in 34.680264ms
2018-10-04T06:48:30Z D! Output [influxdb] buffer fullness: 0 / 10000 metrics.
2018-10-04T06:48:36Z D! [input.vsphere]: Query returned 0 metrics

@prydin (Contributor) commented Oct 4, 2018

@rsasportes I don't see any errors in that log. Just debug messages. Am I missing something?

@prydin (Contributor) commented Oct 4, 2018

Oh! Now I see the problem! The query isn't returning any data. Which version of Telegraf are you on? 1.8 has a bug in it that can cause queries to return 0 objects if the time on the node where Telegraf runs is ahead of vCenter. This is fixed in 1.8.1, which was just released.

@rsasportes commented:

Hi,

First, thanks for your help ;-)

I've upgraded Telegraf to 1.8.1_1, and now some datastores start to appear in the dashboard.
Unfortunately, it is only a small number of them (3 out of 40).

As the log shows:



Latest: 2018-10-05 07:45:17.898618 +0000 UTC, elapsed: 304.955344, resource: datastore
2018-10-05T07:50:17Z D! [input.vsphere]: Start of sample period deemed to be 2018-10-05 07:45:17.898618 +0000 UTC
2018-10-05T07:50:17Z D! [input.vsphere]: Collecting metrics for 40 objects of type datastore for VCENTER
2018-10-05T07:50:17Z D! [input.vsphere]: Querying 37 objects, 256 metrics (3 remaining) of type datastore for VCENTER. Processed objects: 37. Total objects 40
2018-10-05T07:50:17Z D! Output [influxdb] wrote batch of 1000 metrics in 42.750677ms
2018-10-05T07:50:20Z D! Output [influxdb] buffer fullness: 165 / 10000 metrics.
2018-10-05T07:50:20Z D! Output [influxdb] wrote batch of 165 metrics in 19.038192ms
2018-10-05T07:50:22Z D! [input.vsphere] Discovering resources for datastore
2018-10-05T07:50:30Z D! Output [influxdb] buffer fullness: 0 / 10000 metrics.
2018-10-05T07:50:37Z D! [input.vsphere]: Query returned 0 metrics
2018-10-05T07:50:39Z D! [input.vsphere]: Query returned 0 metrics



I've extended the data collection interval to 5 minutes, just to check if there is a timeout. No luck.

Do you think it might be possible to manually launch data collection, alongside verbose logging, and trace what's wrong?

Again, thanks !

@prydin (Contributor) commented Oct 5, 2018

Do you see any metrics at all? Do you see a complete set of metrics for some datastores or do you only see sporadic metrics for random datastores? Also, for anything that's missing, can you go to the vCenter UI and make sure you can see those metrics under Monitoring->Performance?

@Muellerflo commented:

Hi,
I have the same issue and only see sporadic metrics in the Grafana dashboard.
I also want to exclude some metrics (all local ESXi disks, which are named "hypervisorxxx-local"):
datastore_metric_exclude = ["*-local"]
But the plugin still collects metrics for 98 datastores ;)

Thanks.

@prydin (Contributor) commented Oct 8, 2018

@Muellerflo the includes and excludes act on metric names, not object names. The ability to filter objects will be added soon. See #4790

As for the sporadic metrics, have you checked the log to see whether you get any timeouts or collections that take longer than the interval?
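To illustrate the metric-name filtering described above (a hedged sketch; the metric name is only an example), an exclude pattern has to match metric names such as disk.used.latest, so an object-name pattern like "*-local" matches no metric name and therefore excludes nothing:

[[inputs.vsphere]]
  ## Acts on metric names, not on datastore names
  datastore_metric_exclude = [ "disk.used.latest" ]
  ## An object-name pattern such as "*-local" would match no metric name here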

@ion-storm commented Oct 12, 2018

We have the same issue; it appears to happen with larger vCenters. One vCenter with only 5 datastores worked perfectly, but on the other, with dozens of datastores, the data collection failed. Any ETA on a fix? This appears to be affecting many others as well.

@prydin (Contributor) commented Oct 12, 2018

@ion-storm anything in the logs?

@prydin (Contributor) commented Oct 12, 2018

@ion-storm There are several reasons why datastore metrics could be missing. What is your collection interval? Have you tried to declare the plugin separately for the datastores with a longer collection interval?

@ybinnenwegin2ip (Contributor) commented:

Hello,

We also seem to be experiencing issues with receiving certain data from certain datastores.

The data that Telegraf stores into the measurement "vsphere_datastore_datastore" does not seem to appear for certain datastores. The data in the measurement "vsphere_datastore_disk" however, does.

So, regarding our setup, we have 25 datastores in total, of which 19 have a type of "VMFS"; the other 6 have a type of "NFS 3". The NFS ones are the ones that don't show up in the "vsphere_datastore_datastore" measurement (in InfluxDB).

When running Telegraf I turned on debug logging and it does discover 25 datastores:

2018-10-17T18:45:34Z D! [input.vsphere]: Collecting metrics for 25 objects of type datastore for VCENTER_INSTANCE

I messed around a bit more in both Telegraf and govmomi and while printing the data that is processed in govmomi's ToMetricSeries function (which seems to fetch numberReadAveraged & numberWriteAveraged, which are stored in "vsphere_datastore_datastore") I only saw the VMFS datastores coming by, none of the NFS datastores.

I haven't dug any deeper yet, but hopefully this will help someone along their way. :)

If anyone wants me to try things out, I'm in the CEST timezone.

@prydin (Contributor) commented Oct 17, 2018

@ybinnenwegin2ip What's the statistics level on your vCenter? I believe you have to be at least at level 3 for those metrics to be collected.

You could also try to check the metrics using the govc tool. Something like this:

govc metric.sample -n 10 /DC/datastore/myds datastore.numberWriteAveraged.average

If that doesn't return any metrics, you're simply not collecting them on your vCenter and you'd have to increase the statistics level for the 5 minute buckets.

@ybinnenwegin2ip (Contributor) commented Oct 17, 2018

@prydin

Thanks for your quick response!

I gave it a shot and this is all I get back:

$ ./govc_linux_amd64 metric.sample /DC_NAME/datastore/DATASTORE_NAME datastore.numberReadAveraged.average
DATASTORE_NAME  -  datastore.numberWriteAveraged.average      num

I'll look into the statistics level, thanks for the pointer!

EDIT:

I noticed your edit, I think you added the -n 10?

Either way, I ran it again, didn't change:

$ ./govc_linux_amd64 metric.sample -n 10 /DC_NAME/datastore/DATASTORE_NAME datastore.numberReadAveraged.average
DATASTORE_NAME  -  datastore.numberReadAveraged.average      num

I've also just increased the statistics level from "1" to "2", let's see what happens. :)

@prydin (Contributor) commented Oct 17, 2018

I think you need at least 3. Just verified in my lab. If I drop it lower than 3, the metric disappears.

@ybinnenwegin2ip (Contributor) commented:

> I think you need at least 3. Just verified in my lab. If I drop it lower than 3, the metric disappears.

Thanks! It's set to 2 now and while I do see some statistics appearing (read_average & write_average) the others are indeed still missing. Perhaps I misunderstood the VMware documentation :)

Level 2

Disk – All metrics, excluding numberRead and numberWrite. 

(https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.monitoring.doc/GUID-25800DE4-68E5-41CC-82D9-8811E27924BC.html)

I guess a networked datastore doesn't count as a disk but rather a 'device' then?

Either way, I'll set it to 3 soon and I'll report back in here. Thanks a lot for pointing me in the right direction!

@prydin (Contributor) commented Oct 17, 2018

Happy to help. You're not the only one who's confused about the documentation around this. :)

@jvigna commented Oct 25, 2018

Hi,

I have exactly the same issue on my vCenter: NO datastore metrics are collected. I use version 1.8.2 of Telegraf. Activating the debug option also gives no hint as to why no metrics are collected. I see these lines, maybe they're a hint:

2018-10-25T09:51:34Z D! [input.vsphere]: Start of sample period deemed to be 2018-10-25 09:46:34.829154 +0000 UTC
2018-10-25T09:51:34Z D! [input.vsphere]: Collecting metrics for 104 objects of type datastore for vcenter
2018-10-25T09:51:34Z D! [input.vsphere]: Querying 37 objects, 256 metrics (3 remaining) of type datastore for vcenter. Processed objects: 37. Total objects 104
2018-10-25T09:51:34Z D! [input.vsphere]: Querying 38 objects, 256 metrics (6 remaining) of type datastore for vcenter. Processed objects: 74. Total objects 104
2018-10-25T09:51:54Z D! [input.vsphere]: Query returned 0 metrics
2018-10-25T09:51:54Z D! [input.vsphere]: Query returned 0 metrics
2018-10-25T09:51:54Z D! [input.vsphere]: Query returned 0 metrics

How could I debug this better?

This is the output of the govc command:

./govc_linux_amd64 metric.sample -n 10 itasz7_01 datastore.numberReadAveraged.average

itasz7_01 - datastore.numberReadAveraged.average num
itasz7_01 naa.6000144000000010f00d71fe54eeb02a datastore.numberReadAveraged.average num
itasz7_01 - datastore.numberReadAveraged.average num
itasz7_01 naa.6000144000000010f00d71fe54eeb02a datastore.numberReadAveraged.average num
itasz7_01 - datastore.numberReadAveraged.average num
itasz7_01 naa.6000144000000010f00d71fe54eeb02a datastore.numberReadAveraged.average num
itasz7_01 - datastore.numberReadAveraged.average num
itasz7_01 naa.6000144000000010f00d71fe54eeb02a datastore.numberReadAveraged.average num

@prydin (Contributor) commented Oct 25, 2018

@jvigna It looks like govc isn't returning any metrics either. Have you tried increasing the statistics level for 5 minute samples in vCenter?

@jvigna commented Oct 25, 2018

It's strange: I've now added 2 more vCenters (smaller ones) and they work without problems. I don't think it is a setting on the vCenter side; could it be that there are too many datastores?

@jvigna commented Oct 25, 2018

BTW: primarily, I'm interested in the disk info, such as:

./govc_linux_amd64 metric.sample itasz7_01 disk.capacity.latest
itasz7_01 - disk.capacity.latest 4294705152 KB
itasz7_01 - disk.capacity.latest 4294705152 KB
itasz7_01 - disk.capacity.latest 4294705152 KB
itasz7_01 - disk.capacity.latest 4294705152 KB

And they seem to work.

@prydin (Contributor) commented Oct 25, 2018

What does your config file look like? Also, please check the stats levels on both vCenters to see if there's a difference.

@jvigna commented Oct 26, 2018

Hi, the stats levels are the same on the 3 vCenter servers, and my config is this:

[[inputs.vsphere]]
vcenters = [ "https://vcenter/sdk" ]
username = "user@domain"
password = "password"
vm_metric_include = [
"cpu.demand.average",
"cpu.idle.summation",
"cpu.latency.average",
"cpu.readiness.average",
"cpu.ready.summation",
"cpu.run.summation",
"cpu.usagemhz.average",
"cpu.used.summation",
"cpu.wait.summation",
"mem.active.average",
"mem.granted.average",
"mem.latency.average",
"mem.swapin.average",
"mem.swapinRate.average",
"mem.swapout.average",
"mem.swapoutRate.average",
"mem.usage.average",
"mem.vmmemctl.average",
"net.bytesRx.average",
"net.bytesTx.average",
"net.droppedRx.summation",
"net.droppedTx.summation",
"net.usage.average",
"power.power.average",
"virtualDisk.numberReadAveraged.average",
"virtualDisk.numberWriteAveraged.average",
"virtualDisk.read.average",
"virtualDisk.readOIO.latest",
"virtualDisk.throughput.usage.average",
"virtualDisk.totalReadLatency.average",
"virtualDisk.totalWriteLatency.average",
"virtualDisk.write.average",
"virtualDisk.writeOIO.latest",
"sys.uptime.latest",
]
host_metric_include = [
"cpu.coreUtilization.average",
"cpu.costop.summation",
"cpu.demand.average",
"cpu.idle.summation",
"cpu.latency.average",
"cpu.readiness.average",
"cpu.ready.summation",
"cpu.swapwait.summation",
"cpu.usage.average",
"cpu.usagemhz.average",
"cpu.used.summation",
"cpu.utilization.average",
"cpu.wait.summation",
"disk.deviceReadLatency.average",
"disk.deviceWriteLatency.average",
"disk.kernelReadLatency.average",
"disk.kernelWriteLatency.average",
"disk.numberReadAveraged.average",
"disk.numberWriteAveraged.average",
"disk.read.average",
"disk.totalReadLatency.average",
"disk.totalWriteLatency.average",
"disk.write.average",
"mem.active.average",
"mem.latency.average",
"mem.state.latest",
"mem.swapin.average",
"mem.swapinRate.average",
"mem.swapout.average",
"mem.swapoutRate.average",
"mem.totalCapacity.average",
"mem.usage.average",
"mem.vmmemctl.average",
"net.bytesRx.average",
"net.bytesTx.average",
"net.droppedRx.summation",
"net.droppedTx.summation",
"net.errorsRx.summation",
"net.errorsTx.summation",
"net.usage.average",
"power.power.average",
"storageAdapter.numberReadAveraged.average",
"storageAdapter.numberWriteAveraged.average",
"storageAdapter.read.average",
"storageAdapter.write.average",
"sys.uptime.latest",
]
cluster_metric_include = [] ## if omitted or empty, all metrics are collected
datastore_metric_include = [] ## if omitted or empty, all metrics are collected
datacenter_metric_include = [] ## if omitted or empty, all metrics are collected
insecure_skip_verify = true

@prydin (Contributor) commented Oct 26, 2018

@jvigna You're collecting all metrics for datastores. That can take a long time. What's your collection interval?

I think what's happening is that the collection takes longer than the collection interval. You have two options:

  1. Reduce the number of datastore metrics you're collecting using datastore_metric_include
  2. Create two instances of the inputs.vsphere plugin: one with e.g. a 20s interval for hosts and VMs, and one with 300s for datastores and clusters (see the sketch below). Those resources only report metrics every 300s anyway, so you don't lose anything (other than making the config file slightly more complex).
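A sketch of option 2, with placeholder connection details and assuming a single vCenter (the exact include/exclude lists are up to you):

[[inputs.vsphere]]
  ## Realtime resources (hosts and VMs) on a short interval
  interval = "20s"
  vcenters = [ "https://vcenter.example.com/sdk" ]
  username = "user"
  password = "password"
  datastore_metric_exclude = [ "*" ]
  cluster_metric_exclude = [ "*" ]

[[inputs.vsphere]]
  ## Historical resources (datastores and clusters), which vCenter only reports every 300s
  interval = "300s"
  vcenters = [ "https://vcenter.example.com/sdk" ]
  username = "user"
  password = "password"
  host_metric_exclude = [ "*" ]
  vm_metric_exclude = [ "*" ]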

@jvigna commented Oct 26, 2018

I will try this, but just for the record, shouldn't I get some sort of warning if such a timeout is really hit?

BTW: my collection interval is already 30s, as with 10s I got that warning. And because of this I already have a separate instance of Telegraf just for collecting the vSphere metrics.

@prydin (Contributor) commented Oct 26, 2018

Yes, there should be errors in the logfile if this is the issue.

Can you try to set datastore_metric_include to just the metric you're interested in, e.g.

datastore_metric_include = [ "disk.capacity.latest" ]

Could you please try that and tell me what the result is and paste a logfile if it doesn't work?

@jvigna commented Oct 26, 2018

OK, I think that could be a good idea. What do I need for capacity? disk.capacity.latest and disk.used.latest? How can I get a list of the metrics?

@prydin (Contributor) commented Oct 26, 2018

Those two are good candidates. To list all metrics available, use the following govc command:

govc metric.ls tasz7_01

@jvigna commented Oct 26, 2018

As soon as I'm able to modify the configuration I'll let you know if it works when sending only a few metrics.
Thanks!

@prydin (Contributor) commented Oct 29, 2018

Keep in mind that it may take up to 30 minutes to see any data on storage capacity, since these are only generated at a 30 minute interval by vCenter.

@Compboy100 commented:

Hi @prydin, I've been using the plugin for a few days now and have also had this issue.

I've tried to create 2 instances like this:
[[inputs.vsphere]]
interval = "301s"
datastore_metric_include = []
force_discover_on_init = true

[[inputs.vsphere]]
interval = "301s"
datastore_metric_include = []
force_discover_on_init = true
datastore_metric_include = [ "disk.capacity.latest", "disk.used.latest", "disk.provisioned.latest", ]
datacenter_metric_include = []
max_query_objects = 64
max_query_metrics = 64
collect_concurrency = 3
discover_concurrency = 3
force_discover_on_init = false
object_discovery_interval = "300s"
timeout = "301s"

I finally got some data showing, but only if I choose 7 days or more. Everything below that does not show content.

@prydin (Contributor) commented Oct 29, 2018

@Compboy100 Which version of the plugin?

@Compboy100 commented:

Hi, thank you for the quick reply.

2018-10-29T14:46:15Z I! Starting Telegraf 1.8.2

Sometimes I also see this in the logs:
2018-10-29T14:47:00Z D! [input.vsphere]: Collecting metrics for 0 objects of type datastore
2018-10-29T14:48:00Z D! [input.vsphere]: Sampling period for datastore of 300 has not elapsed for host

Even though I have the interval at 301.
Like I mentioned, datastores are now at least populated above a 7-day interval.
The dashboard used is from Mr. De La Cruz: https://jorgedelacruz.uk/2018/10/01/looking-for-the-perfect-dashboard-influxdb-telegraf-and-grafana-part-xii-native-telegraf-plugin-for-vsphere/

@prydin (Contributor) commented Oct 29, 2018

I think I've tracked this down to minor clock skew between vCenter and the ESXi hosts. Working on a fix.

@prydin (Contributor) commented Nov 1, 2018

I've been working on this over the last few days and addressed multiple issues:

  1. vCenter is sometimes (very) late posting metrics, especially when running under high load. Metrics can be as much as 15 minutes delayed. We've addressed this by applying a "lookback", i.e. fetching a few sample periods back every time we query metrics. Surprisingly, this doesn't seem to have a significant performance impact and solves this issue.
  2. Data collections could time out without an error message (regression). I suspect that some of the issues reported may have been caused by that.
  3. vCenter 6.5 seems to over-estimate the size of a query for cluster metrics and reject it. Solved this by decreasing the query batch size for cluster queries.

@prydin (Contributor) commented Nov 1, 2018

Anyone who wants to be a beta tester for the fix? It's available here:

https://github.com/prydin/telegraf/releases/tag/prydin-4789

@Compboy100 commented Nov 1, 2018

Thank you, I will try it.

Can I just extract the vsphere plugin folder to my Linux distro, or do I have to compile the whole thing?

@prydin (Contributor) commented Nov 1, 2018

It's binaries. Nothing to compile.

@danielnelson (Contributor) commented:

@prydin Feel free to make a PR too; this will build all the packages on CircleCI and I can add links to the artifacts. Just note in the PR that it's still preliminary.

@prydin (Contributor) commented Nov 1, 2018

@danielnelson Will do. On the go today, but I'll get a PR filed as soon as I get back to home base.

@Compboy100 commented Nov 1, 2018

No luck getting data below 24h yet. Will report more tomorrow when back in the office.
2018-11-02T13:32:00Z D! [input.vsphere]: Starting plugin
2018-11-02T13:32:00Z D! [input.vsphere]: Running initial discovery and waiting for it to finish
2018-11-02T13:32:00Z D! [input.vsphere]: Discover new objects for
2018-11-02T13:32:00Z D! [input.vsphere] Discovering resources for host
2018-11-02T13:32:00Z D! [input.vsphere] Discovering resources for vm
2018-11-02T13:32:01Z D! [input.vsphere] Discovering resources for datastore
2018-11-02T13:32:04Z D! [input.vsphere] Discovering resources for datacenter
2018-11-02T13:32:05Z D! [input.vsphere]: Collecting metrics for 0 objects of type datastore for
2018-11-02T13:37:11Z E! Error in plugin [inputs.vsphere]: ServerFaultCode: This operation is restricted by the administrator - 'vpxd.stats.maxQueryMetrics'. Contact your system administrator

2018-11-02T13:42:07Z D! [input.vsphere] Discovering resources for datastore
2018-11-02T13:43:05Z D! [input.vsphere]: Latest: 2018-11-02 13:37:18.360235 +0000 UTC, elapsed: 364.880925, resource: datastore
2018-11-02T13:43:05Z D! [input.vsphere]: Collecting metrics for 16 objects of type datastore for
2018-11-02T13:43:05Z D! [input.vsphere]: Queuing query: 16 objects, 48 metrics (0 remaining) of type datastore for . Total objects 16 (final chunk)
2018-11-02T13:43:05Z D! [input.vsphere] Query for datastore returned metrics for 16 objects
2018-11-02T13:43:05Z D! [input.vsphere] CollectChunk for datastore returned 48 metrics
2018-11-02T13:44:05Z D! [input.vsphere]: Latest: 2018-11-02 13:43:18.24116 +0000 UTC, elapsed: 65.203200, resource: datastore
2018-11-02T13:44:05Z D! [input.vsphere]: Sampling period for datastore of 300 has not elapsed on
2018-11-02T13:45:05Z D! [input.vsphere]: Latest: 2018-11-02 13:43:18.24116 +0000 UTC, elapsed: 125.226100, resource: datastore
2018-11-02T13:45:05Z D! [input.vsphere]: Sampling period for datastore of 300 has not elapsed on

@prydin (Contributor) commented Nov 2, 2018

Ah! That one is easy to fix. I'm assuming you're running an older version of vCenter? Go ahead and set max_query_metrics to 64.

Like this:
max_query_metrics = 64

If that doesn't work, try decreasing it to 20.
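A sketch of the suggested setting in context (placeholder connection details); the idea is to keep it below the vCenter-side vpxd.stats.maxQueryMetrics limit named in the error above:

[[inputs.vsphere]]
  vcenters = [ "https://vcenter.example.com/sdk" ]
  username = "user"
  password = "password"
  ## Lower this further (e.g. to 20) if vCenter still rejects queries with the
  ## 'vpxd.stats.maxQueryMetrics' ServerFaultCode shown earlier
  max_query_metrics = 64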

@Compboy100 commented:

Yes, 6.0. I had already set it to 64.
I've now decreased it to 40, and will change it to 20 if the problem persists.

@Compboy100 commented:

Can confirm that data is being recorded for below 24h now.
Thank you @prydin

@Compboy100 commented:

@prydin Don't forget the PR. There is an RC1 live; does it include the changes?
From my side, I am getting better stats now with the test release.
There are still some minor issues:
[image]
But at least I'm getting data now.

@prydin (Contributor) commented Nov 6, 2018

I've been wanting to let this run in my lab for a while first, but I just opened a PR. @glinton and @danielnelson, is there still a chance to get this into 1.9?

@prydin (Contributor) commented Nov 6, 2018

@Compboy100 no, RC1 doesn't have the changes. I wanted to make sure it ran OK in my lab first.

@danielnelson (Contributor) commented:

It's a possibility, if not for 1.9.0 then it should be possible to get this in for 1.9.1. Let's focus on getting it added to master first, I'll review today.

@danielnelson (Contributor) commented:

Closed in #4968, @Compboy100 I'm creating a new release candidate this afternoon.
