April 22, 2021

ipSpace.net Blog (Ivan Pepelnjak)

Microsoft Azure: Remember Exchange Server?

Recently I joked there’s a significant difference between AWS and Azure launching features:

  • AWS launches a production-ready feature that you can consume the next day.
  • Azure launches a preview that might work in 6 months.

Those with long enough memories shouldn’t be surprised. It’s not the first time Microsoft has used these tactics.

April 22, 2021 06:53 AM

April 21, 2021

My Etherealmind
Ethan Banks on Technology

If You Haven’t Checked Your Backups, They Probably Aren’t Working

This is a pleasant reminder to check your backups. I don’t mean, “Hey, did the backup run last night? Yes? Then all is well.” That’s slightly better than nothing, but not really what you’re checking for. Instead, you’re determining your ability to return a system to a known state by verifying your backups regularly.

Backups are a key part of disaster recovery, where modern disasters include ransomware, catastrophic public cloud failures, and asset exposure by accidental secrets posting.

For folks in IT operations such as network engineers, systems to be concerned about include network devices such as routers, switches, firewalls, load balancers, and VPN concentrators. Public cloud network artifacts also matter. Automation systems matter, too. And don’t forget about special systems like policy engines, SDN controllers, wifi controllers, network monitoring, AAA, and…you get the idea.

Don’t confuse resiliency for backup.

When I talk about backups, I’m talking about having known good copies of crucial data that exist independently of the systems they normally live on.

  • Distributed storage is not backup.
  • A cluster is not backup.
  • An active/active application delivery system spread over geographically diverse data centers is not backup.

The points above are examples of distributed computing. Distributed computing is not about recovering from a system-wide failure or fundamental data corruption. Distributed computing is about maintaining application availability in the face of elastic demand and component failure. This is not backup.

Are you backing up the correct things?

Systems change over time. What you configured the backup software to grab three years ago might not be sufficient today. Are you grabbing the correct device configurations? For all devices that matter? Including the devices you turned up just last month?
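
One way to answer those questions is to compare your device inventory against what the backup system actually holds. The following is a minimal Python sketch, not a finished tool: the device names, the one-day freshness threshold, and the `find_backup_gaps` helper are all made up for illustration, and in real life the two inputs would come from your source of truth and your backup repository.

```python
from datetime import datetime, timedelta


def find_backup_gaps(inventory, last_backup, now, max_age=timedelta(days=1)):
    """Return (missing, stale) lists of device names.

    inventory   -- iterable of device names that should be backed up
    last_backup -- dict mapping device name to the datetime of its newest backup
    """
    # Devices that exist in the inventory but have never been backed up.
    missing = sorted(d for d in inventory if d not in last_backup)
    # Devices whose newest backup is older than the allowed age.
    stale = sorted(
        d for d in inventory
        if d in last_backup and now - last_backup[d] > max_age
    )
    return missing, stale


# Hypothetical data for illustration only.
now = datetime(2021, 4, 21, 12, 0)
inventory = ["core-rtr1", "edge-fw1", "leaf-sw7"]
last_backup = {
    "core-rtr1": datetime(2021, 4, 21, 2, 0),  # backed up last night
    "edge-fw1": datetime(2018, 6, 1, 2, 0),    # job silently stopped years ago
}
missing, stale = find_backup_gaps(inventory, last_backup, now)
print("never backed up:", missing)
print("stale:", stale)
```

The switch turned up last month shows up as never backed up, and the firewall whose job quietly died shows up as stale, which are exactly the two cases a "did the job run?" check misses.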

If you’re treating infrastructure as code, is version control sufficient? Have you prepared for a version control system failure? After all, that’s where your golden configs reside. What would happen if, say, your GitHub repo was wiped out?

If you run a private ops server of your own with scripts or playbooks on it, is that precious data being backed up? Or is it all in a VM on your laptop representing hours and hours of work you could never recreate if something went sideways? You could at least take a snapshot and copy it offsite now and then.

Is your retention schedule correct?

It’s nice to have last night’s backup. But what if you need last week’s backup? Or last month’s? Bad things can happen to data, and you might be backing up garbage without realizing it. Have you reviewed your retention schedule to know how many backups over what period of time you have access to?

It could also be that the retention parameters you defined back in the day are no longer valid due to policy changes. Maybe regulations impacting your company have changed. Maybe there are new SecOps guidelines for data retention that impact your backup routine.

Yet another consideration is storage space, cheap though it is, being wasted with needless ancient backups. You probably don’t need five years of dailies. It’s good to clean up once in a while, and then set the retention schedule to something that cleans up after itself so you don’t have the “five years of dailies” problem again.
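
As a rough illustration of a retention schedule that cleans up after itself, here is a small Python sketch. The policy (every backup from the last 14 days, plus the oldest backup of each of the last 12 calendar months) is an assumption picked for the example, as is the `select_backups_to_keep` helper. Your own numbers should come from your policy and regulatory requirements.

```python
from datetime import date, timedelta


def select_backups_to_keep(backup_dates, today, daily_days=14, monthly_months=12):
    """Return the set of backup dates to keep.

    Keeps every backup from the last `daily_days` days, plus the oldest
    backup of each calendar month within the last `monthly_months` months.
    """
    keep = set()
    first_of_month = {}
    for d in sorted(backup_dates):
        age_days = (today - d).days
        if age_days < 0:
            continue  # ignore dates in the future (clock problems)
        if age_days < daily_days:
            keep.add(d)
        months_old = (today.year - d.year) * 12 + (today.month - d.month)
        if months_old < monthly_months:
            # Dates are processed oldest first, so the first one seen
            # for a given month is that month's oldest backup.
            first_of_month.setdefault((d.year, d.month), d)
    keep.update(first_of_month.values())
    return keep


today = date(2021, 4, 21)
# Hypothetical history: one backup per day for roughly five years.
all_backups = [today - timedelta(days=n) for n in range(5 * 365)]
keep = select_backups_to_keep(all_backups, today)
prune = set(all_backups) - keep
print(len(all_backups), "backups,", len(keep), "kept,", len(prune), "to prune")
```

Run on a schedule, something like this keeps the "five years of dailies" pile from growing back. Actually deleting the pruned files is deliberately left out of the sketch.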

Will your backup survive a natural disaster?

One reason we back up is to recover from fires, floods, hurricanes, earthquakes, solar flares, extraterrestrial attack, and the like. If all copies of your backup are in the same physical location as what you’re backing up, that’s no good. Get some copies off site. These days, that probably means a secure location in the cloud.

Note the word “secure”. Please no more crucial data in unsecured S3 buckets, okay?

Can you access the backup media?

Recovering from a system failure is not the time to realize you can’t access the location of the backup data. It’s a good fire drill to go through now and then to be sure you know two things.

  1. Where the backup data lives. You forget these things if you haven’t thought about it in months.
  2. How to access that backup data. Maybe you’re backing up to a shared S3 bucket, but credentials have changed and no one told you. Maybe the Dropbox folder shared with your team is no longer syncing for you because your access was removed.

If it’s been months or years since you’ve thought about this, assume nothing.

Can you decrypt the backup?

If your backup is encrypted, can you decrypt it? Kind of a big deal, and not something you want to have to think about when under the gun, especially if being asked to supply a key or passphrase unexpectedly.

If you’re on vacation, can someone else recover from backup?

This point is really about your emergency runbook, because of course you have one of those. The runbook explains how to recover a variety of device types from backup. Is that runbook up to date? If you’re not entirely sure where the runbook even is, then no, it’s not up to date.

Perhaps once a quarter, it’s a good idea to review your emergency runbook. Follow the procedure as listed. Given the instructions in the book, can you successfully recover a device from backup?

When’s the last time you did a backup health check?

Backups are crucially necessary and incredibly boring all at the same time. We almost never need backups, and so they tend to fall down the task list next to “update interface descriptions to the new standard” and “write the new standard for interface descriptions”. Yet, when disaster strikes, the most important thing in the world might be recovering from that backup data.

Have you thought about backups lately? Maybe it’s time to bump them up the priority list. Oh. Backups weren’t on the priority list? I see…I see.

This post was inspired by my own impromptu backup review of a web server I operate. I discovered that one of my backup file repositories hadn’t been updated since 2018, and the retention schedule was not what I had thought it was. I say this to my shame, but that’s kind of the point. These things are easy to forget when the sun is out and everything is millions of peaches.

If you have your own backup tips or horror stories, you can share them with me @ecbanks on Twitter or via the Packet Pushers Podcast Network Slack channel. Maybe we’ll record a podcast together about it.

by Ethan Banks at April 21, 2021 12:09 PM

ipSpace.net Blog (Ivan Pepelnjak)

Response: Is Switching Latency Relevant?

Minh Ha left another extensive comment on my Is Switching Latency Relevant blog post. As is usually the case, it’s well worth reading, so I’m making sure it doesn’t stay in the small print (this time interspersed with a few comments of mine in gray boxes).

I found Cisco apparently manages to scale port-to-port latency down to 250ns for L3 switching, which is astonishing, and way less (sub 100ns) for L1 and L2.

I don’t know where FPGA fits into this ultra low-latency picture, because FPGA, compared to ASIC, is bigger, and a few times slower, due to the use of Lookup Table in place of gate arrays, and programmable interconnects.

April 21, 2021 08:20 AM

XKCD Comics

April 20, 2021

ipSpace.net Blog (Ivan Pepelnjak)

Using Unequal-Cost Multipath to Cope with Leaf-and-Spine Fabric Failures

Scott submitted an interesting comment to my Does Unequal-Cost Multipath (UCMP) Make Sense blog post:

How about even Large CLOS networks with the same interface capacity, but accounting for things to fail; fabric cards, links or nodes in disaggregated units. You can either UCMP or drain large parts of your network to get the most out of ECMP.

Before I managed to write a reply (sometimes it takes months while an idea is simmering somewhere in my subconscious) Jeff Tantsura pointed me to an excellent article by Erico Vanini that describes the types of asymmetries you might encounter in a leaf-and-spine fabric: an ideal starting point for this discussion.

April 20, 2021 07:15 AM

April 19, 2021

Packet Pushers

Get Smart About Cloud Networking – A Packet Pushers Livestream Event, April 22

There are lots of reasons to get educated about cloud networking. You might:

  • Be responsible for connecting end users to numerous cloud services
  • Have to link an application in Cloud A to services and data in Cloud B
  • Support a hybrid application that has one foot in your DC and another in AWS, Azure, or […]

The post Get Smart About Cloud Networking – A Packet Pushers Livestream Event, April 22 appeared first on Packet Pushers.

by Drew Conry-Murray at April 19, 2021 09:15 PM

Ethan Banks on Technology

A Networking Perspective On Zero Trust Architecture (ZTA)

Zero Trust Architecture (ZTA) is a security point of view that has gathered enough momentum in 2020 and 2021 to frequently appear in marketing literature. The big idea of zero trust in network computing is roughly, “I confidently know who you are and have applied an appropriate security policy, but I still don’t trust you.”

My understanding of ZTA continues to evolve. This post represents my understanding today, with an emphasis on what ZTA means for network engineers.

How Is ZTA Different From Firewall Rules?

At first glance, zero trust sounds mostly like a firewall policy. Of course I don’t trust you. That’s why we apply all these filtering rules to the VPN tunnel, network interface, etc. Yes, but simple filtering implies a level of trust. The trust comes in the assumption that if you get through the filter, what you’re saying is trustworthy.

Zero trust does away with that assumption. For example…

  1. ZTA could mean that just because a VPN user passed a complex authentication scheme, their transactions are not assumed to be wholesome. Well done–your username and password check out, and we’ve applied a filtering policy to your tunnel. With that completed, we’re now going to monitor every single thing you say to make sure you don’t try anything funny.
  2. ZTA could also mean that despite effective microsegmentation, communications across allowed ports are not trusted. You modeled the east-west data center traffic really well. The resulting microsegmentation policy is thorough. That’s nice. Now we’re going to scrutinize what each host is talking about inside of these allowed conversations.
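
To make the difference concrete, here is a toy Python sketch contrasting the two models. Everything in it is hypothetical: the allowed port, the "suspicious" markers, and both function names are invented for illustration, and no real zero trust product works from a two-entry string list.

```python
ALLOWED_PORTS = {443}                      # hypothetical filtering policy
SUSPICIOUS_MARKERS = ("../", "' OR 1=1")   # toy content checks, not a real rule set


def filter_only(session_authenticated, port):
    """Classic model: authenticated and on an allowed port means trusted."""
    return session_authenticated and port in ALLOWED_PORTS


def zero_trust(session_authenticated, port, payload):
    """Zero trust model: passing the filter is necessary but not sufficient."""
    if not filter_only(session_authenticated, port):
        return False
    # Every request is inspected, even inside an allowed conversation.
    return not any(marker in payload for marker in SUSPICIOUS_MARKERS)


request = {
    "session_authenticated": True,
    "port": 443,
    "payload": "GET /reports/../../etc/passwd",
}
print(filter_only(request["session_authenticated"], request["port"]))
print(zero_trust(**request))
```

The same request passes the filter-only check and fails the zero trust check: authentication and an allowed port get you a conversation, not the benefit of the doubt.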

Presumption Of Breach

Zero trust assumes that every endpoint has been compromised and represents a threat. Therefore, even though an endpoint is connected to the network legitimately and allowed secure access to resources, the access requests themselves are suspect.

Let’s try a couple of metaphors…

  • A registered vehicle might be allowed on public roads, but could be carrying contraband. A security control might be to check every vehicle for contraband before allowing the vehicle to continue.
  • A human might appear healthy, but could be an asymptomatic virus carrier. A security control might be to test every human for the presence of a virus before they can mingle with other humans.

I Don’t Feel Edgy Anymore

Another complicating factor driving ZTA is that a modern network doesn’t have natural checkpoints where deep packet inspection should clearly be done. Historically, we’ve interwoven physical campuses, colocation facilities, and data centers with carefully constructed WAN links. We’d plumb in our circuits and stick a firewall between the public bit and the private bit. Or we’d erect a barrier between zones where it just made sense, such as between QA and production environments. Of course, remote workers had to come in via a specific VPN concentrator to access company computing resources, and that concentrator was probably a DPI firewall, too.

Nowadays, the perimeter simply doesn’t exist in any meaningful way. We scatter compute resources folks need to access across physical premises, data centers, colocation, and a wide variety of public cloud services. And the humans themselves requiring secure access could be and, in the pandemic age, are anywhere. Where is the perimeter with the obvious junctions at which to place traffic checkpoints?

In this edgeless context, I’ve noticed that ZTA tends to be endpoint-oriented, because that’s where enforcement often needs to happen. Think agents manipulating local host firewalls, for instance. Endpoints are not the only places where security enforcement can happen, however. For example, some zero trust products are proxies, which play architecturally well into the edgeless network concept. You can put a proxy anywhere and point a client at it. Performance when using said proxy is another question, but we don’t need to discuss the speed of light today.

Using NIST’s SP 800-207 As A Zero Trust Architecture Reference

While despairing that zero trust had become a meaningless marketing term, I discovered NIST’s SP 800-207, Zero Trust Architecture document. NIST 800-207 is now my standard to interpret vendor marketing claims about the zero trust capabilities of their products.

NIST SP 800-207 is aimed at enterprises developing a zero trust architecture. The document includes ZTA principles (the seven tenets), logical components comprising a ZTA, deployment scenarios most enterprise network architects would recognize, implementation strategies, and associated risks. The document isn’t overly long at 59 PDF pages, but most of these pages are dense.


Vendor security literature might mention both ZTA and zero trust network access (ZTNA). Based on the reading I’ve done thus far in NIST SP 800-207 and elsewhere, I believe that ZTA is opinionated about the entire computing stack, while ZTNA is focused on the network layer. Assuming I’m correct about that, technologies such as 802.1X and NAC fall under the ZTNA umbrella, while ZTNA itself falls under the larger ZTA umbrella.

The idea of “What do words mean?” comes into play here, because depending on which vendor you talk to, ZTA or perhaps just “zero trust” and ZTNA seem interchangeable. Pair that with the opinion some network engineers hold that ZTNA is just NAC, and zero trust suddenly feels like a nothingburger.

However, ZTA implies something more significant than merely ZTNA. Therefore, when evaluating products for your organization’s ZTA strategy, it’s important to find out if what the vendor means is also what you mean. Do not assume the shiny new zero trust solution is just a rebranded version of their NAC solution. That might be all there is to it, or there might be something more complex and capable going on.

In summary, my current thinking is that the terms zero trust or ZTA signal a strategy while the term ZTNA signals a product or solution that’s a tactical component of that strategy.

If you have an opinion about ZTA vs. ZTNA and whether ZTNA is just a buzzword to describe NAC, let me know on Twitter @ecbanks or DM on the Packet Pushers Slack channel. I appreciate all feedback, corrections, and alternate viewpoints from anyone in the networking community, including you nice vendor folks.

by Ethan Banks at April 19, 2021 07:57 PM

ipSpace.net Blog (Ivan Pepelnjak)

Starting Network Automation for Non-Programmers

The reader asking about infrastructure-as-code in public cloud deployments also wondered whether he has any chance at mastering on-premises network automation due to lack of programming skills.

I am starting to get concerned about not knowing automation, IaC, or any programming language. I didn’t go to college, like a lot of my peers did, and they have some background in programming.

First of all, thanks a million to the “everyone needs to become a programmer” hipsters for thoroughly confusing people. Now for a tiny bit of reality.

April 19, 2021 06:22 AM

XKCD Comics

April 18, 2021

ipSpace.net Blog (Ivan Pepelnjak)

Worth Reading: Get Better at Programming by Learning How Things Work

Who would have thought that you could get better at what you do by figuring out how things you use really work? I probably made that argument (about networking fundamentals) too many times; Julia Evans claims the same approach applies to programming.

April 18, 2021 07:14 AM

April 17, 2021

ipSpace.net Blog (Ivan Pepelnjak)

MUST READ: Machine Learning is a Marvelously Executed Scam

I thought I was snarky and somewhat rude (and toned down some of my blog posts on second thought), but I’m a total amateur compared to Corey Quinn. His last masterpiece – Machine Learning is a Marvelously Executed Scam – is another MUST READ.

April 17, 2021 07:06 AM

April 16, 2021

Cioara's Cisco Blog

Upgrading 3750X can take longer than you think

Many years ago I upgraded a Cisco 3750X stack to a newer version of IOS. Since the production system I was planning to upgrade had some critical systems on it, I tested the process on a stack in the lab first. At the outset I figured “no problem, this will take a few minutes to […]

The post Upgrading 3750X can take longer than you think appeared first on tekopolis.

by Adam at April 16, 2021 05:53 PM

Packet Pushers

Cisco’s Personal Insights: Good Intentions On The Road To Hell

New behavioral monitoring capabilities in Cisco Webex are being positioned as insights to improve employee well being and work-life integration, but adding layers of surveillance isn't the way to create a more humane workplace.

The post Cisco’s Personal Insights: Good Intentions On The Road To Hell appeared first on Packet Pushers.

by Drew Conry-Murray at April 16, 2021 03:17 PM

ipSpace.net Blog (Ivan Pepelnjak)

Video: Transparent Bridging Fundamentals

Years ago I wrote a series of blog posts comparing transparent bridging and IP routing, and creating How Networks Really Work materials seemed like a perfect opportunity to make that information more structured, starting with Transparent Bridging Fundamentals.

The video is available with Free ipSpace.net Subscription.

April 16, 2021 06:42 AM

The Networking Nerd

Real Life Ensues

Hey everyone! You probably noticed that I didn’t post a blog last week. Which means for the first time in over ten years I didn’t post one. The streak is done. Why? Well, real life decided to take over for a bit. I was up to my eyeballs in helping put on our BSA council Wood Badge course. I had a great time and completely lost track of time while I was there. And that means I didn’t get a chance to post something. Which is a perfect excuse to discuss why I set goals the way that I do.

Consistency Is Key

I write a lot. Between my blog here and the writing I do for Gestalt IT I do at least 2-3 posts a week. That’s on top of any briefing notes I type out or tweets I send when I have the energy to try and be funny. For someone that felt they weren’t a prolific writer in the past I can honestly say I spend a lot of time writing out things now. Which means that I have to try and keep a consistent schedule of doing things or else I will get swamped by some other projects.

I set the goal of one post a week because it’s an easy checkpoint for me. If it’s Friday and I haven’t posted anything here I know I need to do something. That’s why a large number of my posts come out on Friday. I keep a running checkpoint in my head to figure out what I want to cover and whether or not I’ve done it. When I can mark it down that I’ve done it then I can rest easy until next week.

With my Gestalt IT writing, I tend to go in batches. I try to find a couple of ideas that work for me and I plow through the posts. If I can get 3-4 done at a time it’s easy to schedule them out. For whatever reason it’s much easier to batch them on that side of the house than it is for me to work ahead on my personal blog.

If I don’t stay consistent I worry that the time I dedicate to blogging is going to be replaced by other things. It’s the same reason I feel like I need to stay on top of exercising or scheduling other meetings. Once the time that I spend taking care of something gets replaced by something else I feel like I never get that time back.

I know that doing things like that doesn’t work the way we would like it to work. Juggling writing without a firm schedule only leads to problems down the road. However, I feel like treating my blog posts like a single juggling ball being tossed up in the air over and over keeps my focus sharp. Unless something major comes along that absolutely steals my focus away I can make it work. I even thought to myself last Thursday that I needed to write something up. Alas, lack of sleep and other distractions got in the way before I could make it happen.

Writing Down the Routine

It’s important that you pencil in your routine to make it stick. Sure, after ten years I know that I need to write something each week. It’s finally ingrained in my head. But with other things, like exercise or harmonica practice or even just remembering to take out the garbage on Thursdays I need to have some way of reminding me or blocking time.

Using a reminders app or a journaling system is a great way to make that happen in your own head. Something you can refer to regularly to make sure that things are getting done. Whatever it is works just fine as long as you’re checking it and updating it regularly. Once you let that slip you’ll find yourself cursing it all because you’re halfway through a month with no updates.

Likewise, you need to make sure to block time on your calendar to take care of important things. My morning routine involves blocking time to go for a walk or a run. I also block time to write down posts and projects that are due. Putting those times on my calendar means that I not only get notified when it’s time to start working on things but that other people can also see what I’m up to and schedule accordingly. Just be careful that you leave time to do other stuff. Also, while it’s important to use that time wisely, don’t just sit there and do nothing if you’ve scheduled writing time. Write something down with the time you have. Even if it’s just a random idea or three. You never know when those half-baked ideas can be leveraged to make full-blown magic!

Tom’s Take

I had every intention of writing my makeup post on Monday. Which slipped to Tuesday or Wednesday. And then I realized that real life is never going to stop. You have to make time for the important things. If that means writing something at midnight to post the next day or jogging up and down a muddy dirt road at ten minutes before midnight to ensure you close your last activity rings you have to do what needs to be done. Time isn’t going to magically appear. The gaps in your schedule will fill up. You need to be the one to decide how you’re going to use it. Let your priorities ensue in real life instead of the other way around.

by networkingnerd at April 16, 2021 03:29 AM

XKCD Comics

April 15, 2021

SNOsoft Research Team

AI Series Part 2: Social Media and the Rise of “Echo Chambers”

AI Series Part 2 of 6

This is the second post in a series discussing AI and its impacts on modern life. In this article, we’ll explore how AI is used in social media and the ramifications of training AI while defining “success” based upon the “wrong” metrics.

Social Media Is not Free

Social media platforms that offer “free” services aren’t actually free. These companies need to make a profit and pay their staff, so all of them must have some form of revenue stream.

In most cases, this source of revenue is to “sell” or use data about their users. For advertisers, knowing about their consumer population and being able to target their advertisements to particular individuals and groups is extremely valuable.

If an organization has limited advertising dollars, they want to put their advertisements and products in front of the people that are most likely to buy them. While some products may have “universal” appeal, others are intended for niche markets (think video games, hiking gear, maternity clothes, etc.).

Social media platforms give advertisers access to their desired target markets. By observing their users and how they interact with the advertisements and other content on the site, these platforms can make good and “educated” guesses about the products that a particular user could or would be interested in and is likely to purchase. By selling access to this data to advertisers, social media both makes a profit and acts as a matchmaker for advertisers and their desired target markets.

Defining “Success” for Social Media Platforms

Most social media platforms are paid based on the number of advertisements that they are able to present to their users. The more advertisements that a particular user views, the more profitable they are to these platforms.

Maximizing the time that a user spends on a social media platform requires the ability to measure the user’s “engagement” with the content. The more interested the user is, the more likely that they’ll spend time on the platform and make it more money.

The way that social media platforms measure engagement has evolved over the years. Earlier, the focus was on the amount of content that a particular user clicked on. This success metric resulted in the creation of “clickbait” to lure users into continually clicking on new content and links and spending time on the platform.

However, over time users have grown increasingly tired of clicking on things that look anything like clickbait. While they may be willing to spend hours on a particular platform, they want their interactions to have some level of substance. This prompted an evolution in how these platforms defined “successful” and “engaging” content.

Giving the User What They Want

The modern goal of social media platforms is to provide users with content that they find “valuable”. The belief is that continually showing users high-value content incentivizes them to spend time on the site (and make the platform more advertising money), react, comment, share, and draw in the attention of their connections.

However, measuring “value” is difficult without clear metrics. To make the system work, these platforms measure the value of content based upon the amount that a user engages with a post.

This is where AI comes into the picture. The social media platform’s content management engine observes user behavior and updates its ranking system accordingly. The posts that receive the most likes, comments, etc. are ranked as more “valuable” and have a higher probability of being shown to users. In contrast, the posts that receive negative feedback (“don’t show me this again”, etc.) are shown less often.

Social Echo Chambers

In theory, this approach should make truly valuable content bubble to the top. In practice, people tend to respond most strongly (i.e. posting comments, likes, complaints, etc.) to content that they feel strongly about. As a result, polarizing content tends to score well under these schemes as people show their support for adorable cats and the political party of their choice and complain about “fake news” (whether or not it is actually fake).

In order to keep users engaged, an AI-based system using user behavior as a metric will naturally create “echo chambers”, where users will only see posts that align with what they already believe. The primary goal of social media platforms is to keep their users happy and engaged, and “echo chambers” are an effective way of achieving this.
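
The engagement-driven ranking described above can be illustrated with a deliberately naive Python sketch. The reaction weights, post names, and the `rank_posts` function are all assumptions made up for this example. Real platforms use far more elaborate models and do not publish their weights.

```python
def rank_posts(posts, reactions):
    """Order posts by a naive engagement score, highest first.

    posts     -- list of post ids
    reactions -- list of (post_id, kind) tuples, where kind is one of
                 "like", "comment", "share", or "hide"
    """
    # Hypothetical weights: positive engagement raises a post's score,
    # explicit negative feedback ("don't show me this again") lowers it.
    weights = {"like": 1, "comment": 2, "share": 3, "hide": -2}
    score = {post_id: 0 for post_id in posts}
    for post_id, kind in reactions:
        score[post_id] += weights.get(kind, 0)
    return sorted(posts, key=lambda p: score[p], reverse=True)


# Hypothetical posts and reactions for illustration only.
posts = ["measured-analysis", "cute-cat", "outrage-politics"]
reactions = [
    ("measured-analysis", "like"),
    ("cute-cat", "like"), ("cute-cat", "like"), ("cute-cat", "share"),
    ("outrage-politics", "comment"), ("outrage-politics", "comment"),
    ("outrage-politics", "share"), ("outrage-politics", "hide"),
    ("outrage-politics", "comment"),
]
ranking = rank_posts(posts, reactions)
print(ranking)
```

Note that an angry comment counts exactly as much as a supportive one. Nothing in the score measures whether the content is accurate or useful, only whether people reacted to it, which is how the polarizing post ends up on top.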

The Bottom Line on AI and Social Media

AI is a crucial component of modern social media, but it is important to consider who this AI is really designed to benefit. Social media platforms, like any other business, are driven by the need to make a profit and keep shareholders happy. AI in social media is designed to accomplish this goal by feeding as many ads as possible to their users.

The post AI Series Part 2: Social Media and the Rise of “Echo Chambers” appeared first on Netragard.

by Adriel Desautels at April 15, 2021 07:08 PM

Packet Pushers

What Does Aruba Do With No SASE To Sell? It Changes The Conversation

SASE was clearly on the minds of Aruba executives during its Atmosphere 2021 event, but without an actual SASE offering in its portfolio, Aruba had to steer the conversation toward SD-WAN and Aruba's role as a partner that can help enterprises transform at their own pace.

The post What Does Aruba Do With No SASE To Sell? It Changes The Conversation appeared first on Packet Pushers.

by Drew Conry-Murray at April 15, 2021 04:08 PM

ipSpace.net Blog (Ivan Pepelnjak)

Fundamentals: Is Switching Latency Relevant?

One of my readers wondered whether it makes sense to buy low-latency switches from Cisco or Juniper instead of switches based on merchant silicon like Trident-3 or Jericho (regardless of whether they are running NX-OS, Junos, EOS, or Linux).

As always, the answer is it depends, but before getting into the details, let’s revisit what latency really is. We’ll start with a simple two-node network.

Figure: The simplest possible network

April 15, 2021 07:47 AM

The Data Center Overlords

Cut-Through Switching Isn’t A Thing Anymore

So, cut-through switching isn’t a thing anymore. It hasn’t been for a while really, though in the age of VXLAN, it’s really not a thing. And of course, as with all things IT, there are exceptions. But by and large, cut-through switching just isn’t a thing.

And it doesn’t matter.

Cut-through versus store-and-forward was a preference years ago. The idea is that cut-through switching had less latency than store and forward (it does, to a certain extent). It was also the preferred method, and purchasing decisions may have been made (and sometimes still are, mostly erroneously) on whether a switch is cut-through or store-and-forward.

In this article I’m going to cover two things:

  • Why you can’t really do cut-through switching
  • Why it doesn’t matter that you can’t do cut-through switching

Why You Can’t Do Cut-Through Switching (Mostly)

You can’t do cut-through switching when you change speeds. If the bits in a frame arrive at 10 Gigabits, they need to go into a buffer before they’re sent over a 100 Gigabit uplink. The reverse is also true: you can’t cut through a frame that’s piling into an interface 10 times faster than the egress interface can send it (though it’s not slowed down).

So any switch that uses a higher-speed uplink than its host-facing ports (which is most of them) is store-and-forward.

Just about every chassis switch involves speed changes. Even if you’re going from a 10 Gigabit port on one line card to a 10 Gigabit port on another line card, there’s a speed change involved. The line card is connected to another line card via a fabric module (typically), and that connection from line card to fabric module is via a higher speed link (typically 100 Gigabit).

There’s also often a speed change when going from one module to another. Even if, say, the line cards were 100 Gigabit and the fabric module were 100 Gigabit, the link between them is usually a slightly higher speed in order to account for internal encapsulations. That’s right, there’s often an internal encapsulation (such as Broadcom’s HiGig2) that slightly enlarges the frames bouncing around inside of a chassis. You never see it, because the encap is added when the packet enters the switch and removed before it leaves the switch. The speed is slightly bumped to account for this, hence a slight speed change. That would necessitate store-and-forward.

As Ivan Pepelnjak noted, I got this part wrong (about Layer 3 and probably VXLAN, the other reasons stand, however).

You can’t do cut-through switching when doing Layer 3. Any Layer 3 operation involves re-writing part of the header (decrementing the TTL) and as such a new CRC for the frame that packet is encapsulated into is needed. This requires storing the entire packet (for a very, very brief amount of time).

So any Layer 3 operation is inherently store-and-forward.

Any VXLAN is store-and-forward. See above about Layer 3, as VXLAN is Layer 3 by nature.

Any time a buffer is utilized. Anytime two frames are destined for the same interface at the same time, one of them has to wait in a buffer. Any time a buffer is utilized, it’s store-and-forward. That one is hopefully obvious.

So whenever there’s a higher-speed uplink, a Layer 3 operation, a buffer in use, or of course VXLAN, the switch is automatically store-and-forward. That covers about 99.9% of use cases in the data center. Even if your switch is capable of cut-through, you’re probably not using it.

It Doesn’t Matter That Everything Is (Mostly) Store-and-Forward

Network engineers/architects/whathaveyou of a certain age probably have it ingrained that “cut-through: good” and “store-and-forward: bad”. It’s one of those persistent notions that may have been true at one time (though I’m not sure cut-through was ever that advantageous in most cases), but no longer is. The notions that hardware RAID is better than software RAID (it isn’t anymore), LAGs should be powers of 2 (not a requirement on most gear), jumbo frames increase performance (minuscule to no performance benefit today in most cases), and MPLS is faster (it hasn’t been for about 20 years) are just a few others that come to mind.

“Cut-through switching is faster” is technically true, and still is, but it’s important to define what you mean by “faster”. Cut-through switching doesn’t increase throughput. It doesn’t make a 10 Gigabit link a 25 Gigabit link, or a 25 Gigabit link a 100 Gigabit link, etc. So when we talk about “faster”, we don’t mean throughput.

What it does is cut the amount of time a frame spends in a single switch.

With 10 Gigabit Ethernet a common speed, and most switches these days supporting 25 Gigabit, the serialization delay (the amount of time it takes to transmit or receive a frame) is minuscule. The port-to-port latency of most DC switches is 1 or 2 microseconds at this point. Compared to other latencies (app latency, OS network stack latency, etc.) this is imperceptible. If you halved the latency or even doubled the latency, most applications wouldn’t be able to tell the difference. Even benchmarks wouldn’t be able to tell the difference.

Cutting down the port-to-port latency was the selling point of cut-through switching. A frame’s header could be leaving the egress interface while its tail end was still coming in on the ingress interface. But since the speeds are so fast, it’s not really a significant cause of communication latency. Storing the frame/packet just long enough to get the entire frame and then forwarding it doesn’t cause any significant delay.
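To put rough numbers on the serialization delay discussed above, here is a minimal back-of-the-envelope sketch. It assumes a 1500-byte frame (an illustrative size, not a figure from the article) and ignores preamble, inter-frame gap, and any lookup or queuing time. A store-and-forward switch has to receive the whole frame before sending it, so this value approximates the extra per-hop delay compared to an ideal cut-through switch.

```python
def serialization_delay_us(frame_bytes: int, link_gbps: float) -> float:
    """Time to clock one whole frame onto (or off) a link, in microseconds."""
    bits = frame_bytes * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds * 1e6


if __name__ == "__main__":
    # 1500 bytes is an assumed frame size used only for illustration.
    for gbps in (10, 25, 100):
        delay = serialization_delay_us(1500, gbps)
        print(f"{gbps:>3} Gbps: {delay:.2f} us per 1500-byte frame")
```

At 10 Gigabit this works out to roughly 1.2 microseconds per frame, and it shrinks proportionally as link speed goes up.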

From iSCSI to VMotion to SQL to whatever, the difference between cut-through and store-and-forward is unmeasurable.

Where Cut-Through Makes Sense

There are a very small number of cases where cut-through switching makes sense, most notably high-frequency trading. In these rare cases where latency absolutely needs to be cut down, cut-through can be achieved. However, there are lots of compromises to be made.

If you want cut-through, your switches cannot be chassis. They need to be top-of-rack switches with a single ASIC (no interconnects). The interface speed needs to be the same throughout the network to avoid speed changes. You can only do Layer 2, no Layer 3 and of course no VXLAN.

The network needs to be vastly overprovisioned. Anytime you have two packets trying to leave an interface at the same time, one has to be buffered, and that will dramatically increase latency (far beyond store-and-forward latency). The packet sizes will also need to be small to reduce latency.

Too-Long; Didn’t Read

The bad news is you probably can’t do cut-through switching. But the good news is that you don’t need to.

by tonybourke at April 15, 2021 05:56 AM

April 14, 2021

Packet Pushers

Cloud Networking Startup Alkira Spins Up In Azure Marketplace

Cloud networking startup Alkira announced that it’s been selected for the “Microsoft for Startups” program. Microsoft offers the program to emerging companies to provide “technology and business support designed to help B2B startups quickly scale.” As part of the program, Alkira will get ecosystem support from Microsoft such as “access to technical, sales and marketing […]

The post Cloud Networking Startup Alkira Spins Up In Azure Marketplace appeared first on Packet Pushers.

by Drew Conry-Murray at April 14, 2021 03:49 PM

ipSpace.net Blog (Ivan Pepelnjak)

Netsim-tools Release 0.5 Works with Containerlab

TL&DR: If you happen to like working with containers, you could use netsim-tools release 0.5 to provision your container-based Arista EOS labs.

Why does it matter? Lab setup is blindingly fast, and it’s easier to integrate your network devices with other containers, not to mention the crazy idea of running your network automation CI pipeline on Gitlab CPU cycles. Also, you could use the same netsim-tools topology file and provisioning scripts to set up a container-based or VM-based lab.

What is containerlab? A cool project that builds realistic virtual network topologies with containers. More details…

April 14, 2021 07:26 AM

XKCD Comics

April 13, 2021

Packet Pushers

In Defense Of EIGRP With Zig Zsiga And Ethan Banks – Video

Zig Zsiga and Ethan Banks talk through use cases for the sometimes maligned EIGRP, a popular choice in Cisco networks for decades. The conversation covers EIGRP design basics, the stuck-in-active problem, stub routing, and RFC7868. Comparisons are made to how OSPF design differs to accomplish similar goals. This was originally published as an audio-only podcast […]

The post In Defense Of EIGRP With Zig Zsiga And Ethan Banks – Video appeared first on Packet Pushers.

by The Video Delivery at April 13, 2021 06:39 PM

What’s In A Title? Network Engineer Vs. Professional Or Licensed Engineer

In the US, do not call yourself a "Professional Engineer" or "Licensed Engineer" as your title. Those are specially reserved titles for those who actually ARE licensed. However, calling yourself a "Network Engineer" is okay. If you want to know more details, read on.

The post What’s In A Title? Network Engineer Vs. Professional Or Licensed Engineer appeared first on Packet Pushers.

by Ed Horley at April 13, 2021 02:55 PM

ipSpace.net Blog (Ivan Pepelnjak)

Must Read: Automate Nexus-OS Fabric Deployment

Some networking engineers breeze through our Network Automation online course, others disappear after a while… and a few of those come back years later with a spectacular production-grade solution.

Stephen Harding is one of those. He attended the automation course in spring 2019 and I haven’t heard from him in almost two years… until he submitted one of the most mature data center fabric automation solutions I’ve seen.

Not only that, he documented the solution in a long series of must-read blog posts. Hope you’ll find them useful; I liked them so much I immediately saved them to Internet Archive (just in case).

April 13, 2021 06:51 AM

April 12, 2021

Packet Pushers

Complexity Of Networking Architecture In The 2020’s

Getting the parts of network strategy right today is a daunting task when you think about it.

The post Complexity Of Networking Architecture In The 2020’s appeared first on Packet Pushers.

by Greg Ferro at April 12, 2021 06:31 PM

My Etherealmind

Packet Pushers LiveStream – Alkira and Multi-cloud Networking

We are doing the first ever Packet Pushers LiveStream on Thursday, April 22nd 1000PST/1300CET/1700GMT. Our take on a LiveStream is a cross between live podcast, presentation and interviews where the audience can join us live for recording. Our sponsor is Alkira and Multi-Cloud Networking. The Alkira product is interesting in its ability to build network […]

by Greg Ferro at April 12, 2021 01:18 PM

Ethan Banks on Technology

Why Being A Late Technology Adopter Pays Off

As a technologist helping an organization form an IT strategy, I’m usually hesitant to recommend new tech. Why? Because it’s new. Adopting technology early in its lifecycle is a risky endeavor. For most organizations, I find that shiny new tech isn’t worth the risk.

Emerging products and protocols are often accompanied by great fanfare. Talks are delivered at conferences, whitepapers are written, and Gartner Cool Vendor designations are awarded. The idea is to make you and me believe that this new tech solves a problem in a novel way that’s never been done before. This is the thing we’ve been waiting for. This is so much better than it used to be in the bad old times. Right. I’m sure it is.

Despite my cynical tone, I am hopeful when it comes to new tech. I really am. In part, technologists are employed because of tech’s ever-changing landscape. But I am also dubious during any technology’s formative years. I take a wait-and-see approach, and I’ve never been sorry for doing so. I believe that being a late, not early, adopter of technology pays off for most organizations.

You Aren’t Stuck With Abandoned Tech

If you adopt early, you are hoping that the tech takes off. Maybe it will, but often it doesn’t. If the tech never sees broad market adoption, vendors are likely to abandon it. Then you’re stuck with it, as any tech you implement creates technical debt that’s hard to get rid of.

There are many reasons technology is abandoned.

  • Sometimes, standards are created that compete with other standards. One wins. Another doesn’t. How’s that TRILL fabric working out for you? Okay for a select few, but not many.
  • Startups build interesting products, but those products don’t always make it in the marketplace. The founders might have to give up if there’s not enough money coming through the door.
  • Sometimes, a startup is acquired, and the new parent company kills the product either purposely or by ignoring it.

Being stuck with technology no one is developing isn’t the end of the world if it’s working for you, but it’s a tradeoff. You’re stuck with the security holes, lack of features, and other quirks that are never going to get fixed. I suppose this is an argument for open source, since an organization could throw developers at the abandoned project if needed. But realistically, few organizations are able to do this.

You Avoid The Burden Of Beta Testing

When a new product comes to market, it’s not fully formed. There will be bugs. There will be features you want that aren’t there and won’t be there for a while. You are, in effect, helping to develop the product. You’re an unwitting beta tester, and you’re paying for the privilege. For some companies, that’s actually what they want. They want to provide input to the vendor to help steer the direction of the product.

For most folks, beta testing is actually not what they’re looking for. For me, the issue is time. I don’t have the time to shepherd the product along. I need the product to work as advertised out of the gate, and not spend half my time frustrated by non-existent (but necessary) features, logging bugs, or finding workarounds.

I want to implement and operate the technology. Not test it. I’m happy to let other folks take on that thankless task. I’ll adopt later in the product’s lifecycle when it’s grown up a bit.

You Don’t Have To Figure It Out For Yourself

If you rely on training to get a handle on a new product, good luck when the product is new. Official vendor training often doesn’t exist early on. Third-party training gets created due to market demand, but most new tech doesn’t have much demand yet…because it’s new. 🐥🥚 It will take time for the product to become known to the market and for demand to ramp.

If not training, then perhaps documentation, right? Some vendors will, as a standard part of their product creation process, release robust documentation for that product. But in other cases, that documentation will be…less robust, shall we say.

If no training or documentation, then vendor support, right? Um, you know how you’re calling support because you’re struggling to find the answers due to a lack of docs? The vendor support team is likely to be in the same situation. You might have a hard time finding a support engineer with the answers.

Adopting technology later in its lifecycle means that you’re more likely to find community support, decent documentation, or training for the product in question.

Vendor Solutions Are More Likely To Interoperate

When a new protocol comes to market, the standards are not always rigidly defined. For IETF RFCs, there’s a pure evil in the words “MUST” vs. “SHOULD” vs. “MAY”. When an RFC reads “SHOULD”, that means the specific point is optional practically speaking, despite being recommended. You should do it. It’s the right thing to do. But you don’t have to do it. There’s usually a behind-the-scenes reason an RFC specifies should instead of must. Someone involved with the writing of the RFC didn’t want to have to while someone else knew it was proper. Read IETF BCP 14 for more detail on how these words are meant to be adhered to.

The openings left in some standards can lead to vendor interoperability problems. The shiny new thing as implemented by Vendor C doesn’t work the same as Vendor J’s implementation. If you’re all in on Vendor J, no problem. But if you have islands of Vendor C that need to work with islands of Vendor J, there are problems where the island edges meet.

Sometimes, these are long-standing problems. For instance, there are vendor interoperability challenges and/or competing standards with segment routing and EVPN that have been ongoing for years now. Sometimes these differences are philosophical and unlikely to be resolved. Other times, the differences might simply be time-related–one vendor implemented a new feature sooner than another vendor.

If you wait a while to allow the technology to mature, you might avoid interop headaches. And by headaches, I mean the inevitable workaround we technologists have to come up with to make the thing work that should have been working in the first place had the standard actually been more than a suggestion.

Your Organization Can Skip Needless Disruption

Change, in general, is disruptive. Changing to a new technology is certainly disruptive. Adopting the tech has operational impact. Transitioning legacy infrastructure to the new tech is risky. Business stakeholders might need training to use the new tech, and certainly the front line IT support staff will have to ramp up. And let’s not forget the dollar cost with its supposed ROI disrupting the budget.

New tech is not a casual undertaking. Is it worth disrupting the business? Is there a payoff? Late adopters are more likely to know the answer to this question, because they can examine the stories of those who have gone before them.

I believe part of SD-WAN’s success in the marketplace is a result of early adopters working out the difficulties, finding the victories, and now being able to conclusively explain how SD-WAN technology is a better solution than what they had before. The disruption of cutting over remote offices, remote workers, and cloud connectivity to the new model paid off.

By the same token, I think many who have adopted public cloud too early have experienced needless disruption. The ill-considered lift-and-shift approach many organizations took early on has resulted in additional IT spending (not a reduction), major operational change, and massive IT staff stress. In some cases, workloads have come back from the public cloud to private data centers, a phenomenon known as cloud repatriation. A late adopter approach would have made more sense as the market remains, even now, unsettled as to what “cloud” is actually supposed to look like.

Examine The Tradeoffs

One argument against late adoption is that of opportunity cost. New tech could give a business a leg up on a competitor (faster to market), improve the bottom line (reduction in cost), or improve the top line (increased sales). Could it be that your organization is giving up revenue by delaying adoption? Perhaps, but I think this fear is overblown most of the time. I’ve seen many abandoned projects in companies over the years, where the organization would have been better off waiting to see if the new thing was going to become an industry mainstay.

I’m not saying that early adoption of new tech is never worth the risk. Once in a while, I suppose it is. Still, new technology is rarely as transformational as it is in PowerPoint. You can probably wait a little longer.

For an alternate viewpoint, check out my friend Chris Wahl’s take, Power to the Early Adopters. Chris says, “I wanted to point out a few of the good things that come from being an early adopter from someone who contributed towards a product that was adopted early by numerous folks.”

by Ethan Banks at April 12, 2021 12:00 PM

ipSpace.net Blog (Ivan Pepelnjak)

Start Automating Public Cloud Deployments with Infrastructure-as-Code

One of my readers sent me a series of “how do I get started with…” questions including:

I’ve been doing networking and security for 5 years, and now I am responsible for our cloud infrastructure. Anything to do with networking and security in the cloud is my responsibility along with another team member. It is all good experience but I am starting to get concerned about not knowing automation, IaC, or any programming language.

No need to worry about that; what you need (to start with) is extremely simple and easy to master. Infrastructure-as-Code is a simple concept: infrastructure configuration is defined in a machine-readable format (mostly text files these days) and used by a remediation tool like Terraform that compares the actual state of the deployed infrastructure with the desired state as defined in the configuration files, and makes changes to the actual state to bring it in line with how it should look.
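The compare-and-remediate idea can be illustrated with a toy sketch. This is not how Terraform is implemented; the `plan` function and the resource names and CIDR values below are hypothetical, chosen only to show the "desired state versus actual state" comparison in a few lines.

```python
def plan(desired: dict, actual: dict) -> list:
    """Return (action, name, config) steps that would turn `actual` into `desired`."""
    steps = []
    for name, config in desired.items():
        if name not in actual:
            steps.append(("create", name, config))   # missing from the deployment
        elif actual[name] != config:
            steps.append(("update", name, config))   # deployed, but drifted
    for name in actual:
        if name not in desired:
            steps.append(("delete", name, None))     # deployed, but no longer wanted
    return steps


if __name__ == "__main__":
    # Hypothetical resources; real tools track far richer state than this.
    desired = {"vpc-main": {"cidr": "10.0.0.0/16"}, "subnet-web": {"cidr": "10.0.1.0/24"}}
    actual = {"vpc-main": {"cidr": "10.0.0.0/16"}, "subnet-old": {"cidr": "10.0.9.0/24"}}
    for step in plan(desired, actual):
        print(step)
```

The point of the sketch is that you describe the end result rather than the steps to get there; the tool works out the difference.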

April 12, 2021 06:14 AM

XKCD Comics

April 11, 2021

My CCIE Training Guide

Building ESXi 6.7u3 over Mini PC PN50

Today I decided enough is enough. A home lab of my own was long overdue, but I didn't want to go overboard with a crazy expensive setup!

At home I have an asustor AS5304T NAS (https://www.asustor.com/en/product?p_id=62) that I use for home backup and external storage.

For the network I have a Netgear AX6000 (https://www.netgear.com/home/wifi/mesh/rbk852-1/) that provides both wired and wireless coverage throughout my apartment.

And today I added this nice Mini PC PN60 from ASUS to the setup. It arrives bare-bones with no memory or hard drive, so I added 32 GB of RAM and a 512 GB NVMe drive from XPG.

At first I thought this would be a simple installation, but I was wrong. The hardware install was really easy and smooth, but I ran into two main challenges. I would like to share the solutions; hopefully they will make your life a bit easier. To get there you need to prepare a bootable USB drive and put your image on it:

Challenge #1 :

Network card identification

To overcome the issue:

  • Make sure you download the 6.7 update 3 zip bundle from vmware.com.
  • Next, open PowerShell and run:
    • Import-Module PowerShellGet
    • Install-Module -Name VMware.PowerCLI -AllowClobber
  • After installation, cd to the directory that contains both the Realtek driver bundle and the ESXi zip bundle and run:
    • Add-EsxSoftwareDepot .\net55-r8168-8.045a-napi-offline_bundle.zip, .\VMware-ESXi-6.7.0-XXXX-depot.zip
  • Get the imported profiles:
    • Get-EsxImageProfile
  • Create a new profile:
    • New-EsxImageProfile -CloneProfile ESXi-6.7.0-xxxxxx-standard -name ESXi-6.7.0-xxxxxx-standard-<YourName> -Vendor <YourName>
  • Set the acceptance level of your new profile:
    • Set-EsxImageProfile -ImageProfile ESXi-6.7.0-xxxxxx-standard-<YourName> -AcceptanceLevel CommunitySupported
  • Validate that you see your newly created profile:
    • Get-EsxImageProfile
  • Identify the name of the new driver:
    • Get-EsxSoftwarePackage | Where {$_.Vendor -eq "Realtek"}
  • Apply the driver to the image:
    • Add-EsxSoftwarePackage -ImageProfile ESXi-6.7.0-xxxxxx-standard-<YourName> -SoftwarePackage <eg. net55-r8168>
  • Create your new ISO file:
    • Export-EsxImageProfile -ImageProfile ESXi-6.7.0-xxxxxx-standard-<YourName> -ExportToIso -filepath .\ESXi-6.7.0-xxxxxx-standard-<YourName>.iso

If you made it this far, you have a new custom ISO file that is ready to be written to a USB drive.

To make the bootable USB drive I used my Mac, but you can use macOS, Windows, or Linux to create a bootable USB from the generated ISO.

Nice guides: 

Bootable ESXI USB over Mac 

Bootable ESXI USB over Linux

Bootable ESXI USB over Windows

Now for challenge #2: NVMe identification. Although you now have a bootable USB, it will not work until you resolve this second issue. I use an XPG NVMe drive (https://www.xpg.com/us/xpg/583), and by default the ESXi installer does not identify it. The solution is surprisingly simple:

  • You will need to download ESXi 6.5 update 02. It is critical not to download a later version, as it may not identify the storage device.
  • Open the ISO by any means you have, extract the file named NVME.V00, and use it to replace the same file on the USB drive you created.

Now your USB is ready and the installation should run smoothly: Next, Next, Next, Done...

by cciep3 (noreply@blogger.com) at April 11, 2021 03:42 PM

ipSpace.net Blog (Ivan Pepelnjak)

Worth Reading: Fail-Fast is Failing... Fast

Here’s an interesting fact: cloud-based stuff often refuses to die; it might become insufferably slow, but would still respond to the health checks. The usual fast failover approach used in traditional high-availability clusters is thus of little use.

For more details, read the Fail-Fast is Failing… Fast ACM Queue article.

April 11, 2021 07:12 AM

April 10, 2021

ipSpace.net Blog (Ivan Pepelnjak)

Worth Reading: Data Manipulation in Jinja2

Ansible and Jinja2 are not an ideal platform for data manipulation, but sometimes it’s easier to hack together something in Jinja2 than writing a Python filter. In those cases, you might find the Data Model Transformation with Jinja2 by Philippe Jounin extremely useful.

April 10, 2021 07:41 AM

April 09, 2021

Ethan Banks on Technology

When Stretching Layer Two, Separate Your Fate

On the Packet Pushers YouTube channel, Jorge asks in response to Using VXLAN To Span One Data Center Across Two Locations:

if stretching the layer 2 is not recommended, then what is the recommendation if you need to fault over to a different physical location and still got to keep the same IP addresses for mission critical applications?


That video is a couple of years old at this point, and I don’t recall the entire discussion. Here’s my answer at this moment in time. If DCI is required (and I argue that it shouldn’t be in most cases), look at VXLAN/EVPN. EVPN is supported by several vendors. If you are a multi-vendor shop, watch for EVPN inter-vendor compatibility problems. Also look for vendor EVPN guides discussing the use case of data center interconnect (DCI).

Also be aware (and beware) of vendor-proprietary DCI technologies like Cisco’s OTV. I recommend against investing in OTV and similar tech unless you already have hardware that can do it and can turn the feature on for free. Otherwise, my opinion, for what it’s worth, is to stick with an EVPN solution. EVPN is a standard that’s been running in production environments for years.

EVPN is complex. There are tradeoffs. You could talk me out of it depending on the scenario. But at the moment, it’s a design I favor because it’s broadly supported across the industry and scalable.

More High Level Detail

Jorge is describing a situation where the network needs to support a badly designed application that’s tightly coupled to an IP address. No application should be tightly coupled to an IP address. This common issue should really be solved by application architects rebuilding the app properly instead of continuing like it’s 1999 while screaming YOLO.

I know there are really old legacy apps that need that L2 adjacency for redundancy or can’t re-IP. I know there are apps that can’t be redesigned because reasons. Fair enough. But what I’d like to ask the business stakeholders is…do you really want to have a critical business function rely on an application that can break when something as ephemeral as an IP address changes? You really don’t, and so I see this as more of a business problem than a technology problem. Your business is tied to an inflexible app. That’s bad for business.

But back to the reality Jorge and many of us face. The business stakeholders are more likely to say, “Make it work,” sticking engineers with a horrifying network design requirement–stretching L2 between physical locations.

Avoid Fate Sharing

The big idea is to support the same IP address in multiple locations, but to NOT have fate-sharing, where a problem like a bridging loop and resulting broadcast storm at one site would take down the other site. That means we can’t just throw up a tagged VLAN link (trunk) between the DCs. Instead, we have to divide the L2 broadcast domain (the VLAN) into different L2 domains separated by a routed segment. This way we’ve created two failure domains that will not share fate.

But this introduces a problem, because now hosts in the separate data centers think they are in the same L2 broadcast domain…but aren’t. Therefore, hosts can’t discover each other to send Ethernet frames back and forth, because ARP broadcasts don’t go any further than the router at the edge of each location.

Tunnels All The Way Down

That means we need a layer on top that connects the two separate L2 domains together, while maintaining that sweet L3 separation. What’s that layer? A tunnel. A tunnel that can encapsulate an entire Ethernet frame and carry it from the L2 domain in the one data center to the L2 domain in the other data center. An encapsulation format designed to do that and commonly used in data centers is VXLAN.
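To make the encapsulation idea concrete, here is a minimal sketch of wrapping an Ethernet frame in the 8-byte VXLAN header defined in RFC 7348 (flags byte with the VNI-valid bit set, a 24-bit VNI, and reserved bits). The function names are illustrative, and the outer Ethernet/IP/UDP headers (VXLAN normally rides over UDP port 4789) are deliberately left out, so this is an illustration of the header layout rather than a working tunnel endpoint.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag from RFC 7348
VXLAN_HEADER_LEN = 8


def vxlan_encap(vni: int, inner_frame: bytes) -> bytes:
    """Prepend a VXLAN header carrying `vni` to a complete inner Ethernet frame."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + 24 reserved bits. Word 2: VNI (24 bits) + 8 reserved bits.
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decap(packet: bytes):
    """Return (vni, inner_frame) from a VXLAN payload built by vxlan_encap."""
    if len(packet) < VXLAN_HEADER_LEN:
        raise ValueError("packet shorter than a VXLAN header")
    flags_word, vni_word = struct.unpack("!II", packet[:VXLAN_HEADER_LEN])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8, packet[VXLAN_HEADER_LEN:]


if __name__ == "__main__":
    frame = bytes(64)  # placeholder for a real Ethernet frame
    packet = vxlan_encap(10100, frame)  # 10100 is an arbitrary example VNI
    print(len(packet), vxlan_decap(packet)[0])
```

The inner frame travels untouched, which is why hosts on either side never notice the routed segment in between.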

Note that there are other encapsulation formats that can also do this such as NVGRE and Geneve, Geneve seeing increased use lately.

Okay…but now we have another problem. How do we know where a VXLAN tunnel should begin and end? That is, where are the VXLAN tunnel endpoints (VTEPs)? And what if we have a bunch of VLANs we want to stretch between data centers like this, because of course we do? How do we track which VXLAN tunnel is carrying traffic for which VLAN and where we should be dropping off these Ethernet frames as they pop in and out of tunnels so that they make it to their destination?

Endpoints Everywhere

Well…you could code all that by hand. (Ha ha ha ha…NO.) Or…you could rely on multicast to do VXLAN advertisement and discovery aka flood and learn. (Um, multicast…arguably, that’s a big request if you’re not already running multicast for other reasons.) Or…you could build out forwarding tables using EVPN. EVPN uses BGP to advertise what MAC addresses are reachable via what VXLAN tunnels.
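Conceptually, what the EVPN control plane gives each tunnel endpoint is a table answering "for this VNI and destination MAC, which remote VTEP do I send to?" The sketch below is a toy model of that table only. The class and method names are hypothetical, the addresses are made up, and none of the actual BGP machinery is represented.

```python
from typing import Dict, Optional, Tuple


class MacVtepTable:
    """Toy (VNI, MAC) -> remote VTEP table, filled by advertisements instead of flooding."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[int, str], str] = {}

    def advertise(self, vni: int, mac: str, vtep_ip: str) -> None:
        # A later advertisement for the same MAC wins (for example, after a host moves).
        self._entries[(vni, mac.lower())] = vtep_ip

    def withdraw(self, vni: int, mac: str) -> None:
        self._entries.pop((vni, mac.lower()), None)

    def lookup(self, vni: int, mac: str) -> Optional[str]:
        return self._entries.get((vni, mac.lower()))


if __name__ == "__main__":
    table = MacVtepTable()
    table.advertise(10100, "00:50:56:AA:BB:01", "192.0.2.11")  # host behind the DC1 VTEP
    table.advertise(10100, "00:50:56:aa:bb:02", "192.0.2.21")  # host behind the DC2 VTEP
    print(table.lookup(10100, "00:50:56:aa:bb:01"))
    table.advertise(10100, "00:50:56:aa:bb:01", "192.0.2.21")  # host moved to DC2
    print(table.lookup(10100, "00:50:56:aa:bb:01"))
```

Because the table is populated by advertisements, an endpoint does not have to flood a frame to discover where a MAC address lives.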

Note that EVPN isn’t limited to using VXLAN tunnels for transport. MPLS is another data plane commonly used. VXLAN just happens to be our context here.

Faking It

Another piece of the puzzle is recognizing that the L2 hosts in each data center have no knowledge of VXLAN tunnels or EVPN advertisements. Hosts still expect to put an ARP on the wire and get back a response so that they know the MAC to put in the destination address field of the Ethernet frame they’re building.

The short answer is that the VXLAN/EVPN solution is going to handle that, too. There are a few different things related to ARP that can happen which are beyond our scope today, but the point is that…ARP is handled. The solution will fake it so that a host is none the wiser that the other host they are trying to communicate with is in some other data center many miles away.

Complexity Breeds Fragility

There are traffic optimization problems that crop up when stretching L2. Search for “DCI traffic trombone” to do some reading on that issue and possible solutions.

And of course, with any technology as complex as EVPN, you’re introducing to the network something else that can break. This is why I’m inclined to push the problem of stretching L2 back to application architects. The problem isn’t the network. It’s the app. Fix the app.

It’s worth pointing out that public cloud vendors don’t let you stretch L2 around the cloud. Or if they do, what’s going on under the hood isn’t simply stretching L2. For example, Ivan Pepelnjak has a 2019 post on this for the Azure cloud. Why do you suppose the public cloud vendors don’t support this? Because they are hosting many customers on a shared infrastructure. Fate sharing is not an option. Therefore, you’ll host your app on their infrastructure their way, and they won’t let you do dumb things. They can’t afford the fragility.

Keep An Open Mind

As a parting thought, it’s worth pointing out that there are other ways to make IPs appear where you need them to. For instance, you can forget about stretching L2 domains at all, and look instead at IP mobility options. Simple host routing is one way to tackle this, albeit a hardware-intensive one at scale.
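Host routing works because routers prefer the longest matching prefix: a /32 route for a single host beats the /24 route for the subnet it came from. The Python sketch below illustrates that lookup with the standard `ipaddress` module. The prefixes and next-hop names are invented for the example, and a real router does this in hardware rather than with a linear scan.

```python
import ipaddress
from typing import Dict, Optional


def longest_prefix_match(routes: Dict[str, str], destination: str) -> Optional[str]:
    """Return the next hop of the most specific route covering destination."""
    dst = ipaddress.ip_address(destination)
    best_len = -1
    best_next_hop = None
    for prefix, next_hop in routes.items():
        network = ipaddress.ip_network(prefix)
        # Keep the matching route with the longest prefix length.
        if dst in network and network.prefixlen > best_len:
            best_len = network.prefixlen
            best_next_hop = next_hop
    return best_next_hop


# Hypothetical routing table: the subnet lives in DC1, one host moved to DC2.
routes = {
    "10.1.1.0/24": "dc1-leaf",
    "10.1.1.50/32": "dc2-leaf",
}
```

With this table, traffic for 10.1.1.50 follows the host route to DC2 while the rest of 10.1.1.0/24 still goes to DC1. Every moved host costs one more forwarding entry, which is where the hardware cost at scale comes from.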

The larger point is that it’s important to keep an open mind about how you can solve the problem you’re trying to solve. Stretching L2 is different from host routing which is different from anycast which is different from application delivery controllers which is different from CDNs. All of these (and other) networking technologies (hi there, DNS) might come into play to make a service (far more important than an IP) appear where you need it to appear.

While there are pros and cons to each approach, of course, know you have many design options to make an application highly available. Don’t get locked into thinking that you have to have a DCI solution because someone who doesn’t know any better told you to stretch a VLAN.

If you’d like to learn more about EVPN, the Packet Pushers offer lots of podcasts and blog posts on the topic. For free. No sign ups, data harvesting, select all the traffic lights, or other nonsense. Just go listen or read.

by Ethan Banks at April 09, 2021 03:54 PM

ipSpace.net Blog (Ivan Pepelnjak)

Bringing New Engineers into Networking on Software Gone Wild

When I started the Software Gone Wild podcast in June 2014, I wanted to help networking engineers grow beyond the traditional networking technologies. It’s only fitting to conclude this project almost seven years and 116 episodes later with a similar theme Avi Freedman proposed when we started discussing podcast topics in late 2020: how do we make networking attractive to young engineers?

Elisa Jasinska and Roopa Prabhu joined Avi and me, and we had a lively discussion that I hope you’ll find interesting.

April 09, 2021 06:34 AM

XKCD Comics

April 08, 2021

ipSpace.net Blog (Ivan Pepelnjak)

Claim: You Don't Have to Be a Networking Expert to Do Kubernetes Network Security

I was listening to an excellent container networking podcast and enjoyed it thoroughly until the guest said something along the lines of:

With Kubernetes networking policy, you no longer have to be a networking expert to do container network security.

That’s not even wrong. You didn’t have to be a networking expert to write traffic filtering rules for ages.

April 08, 2021 06:14 AM

April 07, 2021

ipSpace.net Blog (Ivan Pepelnjak)

Reader Question: What Networking Blogs Would You Recommend?

A junior networking engineer asked me for a list of recommended entry-level networking blogs. I have no idea (I haven’t been in that position for ages); the best I can do is to share my list of networking-related RSS feeds and the process I’m using to collect interesting blogs:


  • RSS is your friend. Find a decent RSS reader. I’m using Feedly – natively in a web browser and with various front-ends on my tablet and phone (note to Google: we haven’t forgotten you killed Reader because you weren’t making enough money with it).
  • If a blog doesn’t have an RSS feed, I’m not interested.
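If you are curious what an RSS reader actually consumes, the following Python sketch pulls post titles and links out of an RSS 2.0 document using only the standard library. The sample feed and the function name are invented for illustration. A real reader would also fetch the feed over HTTP and handle Atom feeds, which this sketch does not attempt.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Made-up RSS 2.0 document standing in for a blog's feed
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Networking Blog</title>
    <item>
      <title>First post</title>
      <link>https://example.com/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/second</link>
    </item>
  </channel>
</rss>"""


def list_posts(feed_xml: str) -> List[Tuple[str, str]]:
    """Return (title, link) pairs for every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    posts = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        posts.append((title, link))
    return posts
```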

April 07, 2021 07:40 AM

XKCD Comics

April 06, 2021

ipSpace.net Blog (Ivan Pepelnjak)

Free Exercise: Build Network Automation Lab

A while ago, someone remarked on my suggestion that networking engineers should focus on getting fluent with cloud networking and automation:

The running thing is, we can all learn this stuff, but not without having an opportunity.

I tend to forcefully disagree with that assertion. What opportunity do you need to test open-source tools or create a free cloud account? My response was thus correspondingly gruff:

April 06, 2021 07:48 AM

April 05, 2021

Ethan Banks on Technology

It’s Not What You Say. It’s How You’re Heard.

In written communication, technical people can sometimes come across as impolite. I see this on Slack (talking down), Twitter (the angry tweeter), in emails (blunt and terse), in blog comments (bitter sarcasm or pedantry), Hacker News discussions (aggressive confrontation), and other places IT builders gather online.

Perhaps you, as just such a technical person, don’t mean to be impolite. Maybe your focus is on efficiency. Get to the point. Say what needs saying, however it comes out. Click send. Job done. Go back to facepalming at the Swagger docs explaining this ill-considered API you need to use.

Here’s the problem with your communications approach. To the person receiving your missive, you might sound like you’re upset. Or tone-deaf. Or maybe just a jerk. You’re presumably none of those things, at least not intentionally. We’re all nice folks who want to get along with our fellow humans, right?

It’s not what you say. It’s how you’re heard.

You need to communicate in such a way that you’re heard as you mean to be heard. If you’re not good at this and want to be, you can improve your messaging.

Before hitting send, engage in role reversal. If you received a message such as the one you’re about to send, how would you perceive it?

A big part of putting yourself in the reversed role is context. As the sender of the message, you have your own internal context that justifies your phrasing. The receiver probably doesn’t have that context. Therefore, you have to think about receiving the message without your own context to fill in the blanks. Would you be offended by the message? Mystified? Irritated? Feel put upon?

If role reversal reveals context problems, edit your message to clearly communicate what you mean to say. The recipient will know where you’re coming from, which will help them interpret your message correctly.

Ask another human to read your message before sending. Ideally, this human is someone you respect for their tactful, thoughtful communication style. If someone just popped to mind and you sighed inwardly…perfect. That’s the human.

Think hard about tone. Your choice of words can soften a blow or sharpen a knife. Indicate sarcasm or share humor. Encourage the reader or depress them.

  • If your tone is angry, the recipient will feel attacked or threatened, even if they are not the object of your wrath.
  • If your tone is cheerful, the reader is more likely to be open to your words, even if you are sharing something negative.
  • If your tone is balanced and considered, you’ll be perceived as thoughtful and trustworthy or knowledgeable.
  • If your tone is clipped and terse, the recipient will read between the lines using what few words you’ve written as clues.
  • If your tone is sarcastic, that might be misinterpreted by the reader if they don’t know you well or lack “sarcasm sense.”

While we can have a robust discussion on the role emojis play in formal writing 😳, I find that for most digital comms, emojis are an effective way to establish tone. Use them. It’s okay. 👍

Communication builds bridges…or burns them down.

Take the time to communicate well. In the age of remote work, your writing in whatever medium is a digital description of who you are. Make sure readers get the right idea.

For a podcast discussion about effective technical communication, listen to Heavy Networking Episode 568, where I interview Drew Conry-Murray, a career technical writer and editor, and the content director at Packet Pushers. This episode is less about tone and perception and more about helping non-nerds understand nerd-speak. The podcast is free. You don’t have to sign up or fill out a form or select all the boats. Just go listen and hopefully get something useful for the time spent.

by Ethan Banks at April 05, 2021 01:54 PM

ipSpace.net Blog (Ivan Pepelnjak)

Building Unnumbered Ethernet Lab with netsim-tools

Last week I described the new features added to netsim-tools release 0.4, including support for unnumbered interfaces and OSPF routing. Now let’s see how I used them to build a multi-vendor lab to test which platforms could be made to interoperate when running OSPF over unnumbered Ethernet interfaces.

I needed to define an unnumbered addressing pool first:

    unnumbered: true

I wanted to run OSPF on all devices in the lab:

    module: [ ospf ]

April 05, 2021 05:57 AM

XKCD Comics

April 02, 2021

ipSpace.net Blog (Ivan Pepelnjak)

Video: Why Do We Need Kubernetes?

Have you ever wondered what the Kubernetes fuss is all about? Why would you ever want to use it? Stuart Charlton tried to answer that question in the introduction part of his fantastic Kubernetes Networking Deep Dive webinar.

You need Free ipSpace.net Subscription to watch the video.

April 02, 2021 06:35 AM

April 01, 2021

Network Design and Architecture

Cisco and Juniper Acquired IETF, all the RFCs name will be converted to JuCi

Happy April Fools’ Day 2021 😊

(Respect to RFC 1925, by the way, itself an April 1st RFC.)


While you are on the website, just know that we have so many Networking Courses as well!

The post Cisco and Juniper Acquired IETF, all the RFCs name will be converted to JuCi appeared first on orhanergun.net.

by Orhan Ergun at April 01, 2021 02:48 PM

Packet Pushers

ASIC Maker Innovium Announces SONiC-Certified Switches For The Cloud And Large Enterprises

Innovium, which makes ASICs to compete with Broadcom and others, is now offering a menu of switches with the SONiC network OS pre-installed. It's a clever opportunity for Innovium to boost its appeal in the whitebox/disaggregation market while also moving its own silicon.

The post ASIC Maker Innovium Announces SONiC-Certified Switches For The Cloud And Large Enterprises appeared first on Packet Pushers.

by Drew Conry-Murray at April 01, 2021 02:37 PM

ipSpace.net Blog (Ivan Pepelnjak)

Planning the Extended Coffee Break: Three Months Later

It’s been almost exactly three months since I announced ipSpace.net going on an extended coffee break. We had some ideas of what we planned to do at that time, but there were still many gray areas, and thanks to tons of discussions I had with many of my friends, subscribers, and readers, they mostly crystallized into this:

You’re trusting me to deliver. We added a “you might want to read this first” warning to the checkout process, and there was no noticeable drop in revenue. Thanks a million for your vote of confidence!

April 01, 2021 06:48 AM